ABSTRACT



The invention relates to the nitrification of wastewater and identification of microorganisms 
capable of participating in this process. Specifically, the invention provides a consortium of 
microorganisms capable of nitrite oxidation in wastewater, which consortium is enriched in members 
of the Nitrospira phylum. The invention also provides oligonucleotide primers and probes for the 
amplification or detection of Nitrospira DNA, kits comprising the primers and probes, and methods of 
detecting and quantitating Nitrospira species in a sample.
AQUATIC NITRITE OXIDISING MICROORGANISMS
TECHNICAL FIELD 

This invention relates to the removal of nitrogenous compounds from wastewater. In particular, 
the invention relates to an isolated consortium of microorganisms capable of nitrification of wastewater. 
The invention also relates to methods of identifying microorganisms capable of nitrification of
wastewater and oligonucleotide primers and DNA probes suitable for use in the methods. 

INTRODUCTION 

The removal of nitrogenous compounds from sewage effluents is an important aspect in the 
remediation of wastewaters. The presence of ammonia, nitrite and nitrate in wastewater discharges
can cause numerous problems ranging from eutrophication (Meganck and Faup, 1988) of the
receiving aquatic environment to aspects of public health concern such as nitrate contamination of 
drinking water. Nitrogen is biologically removed from wastewaters in a two step process of 
nitrification (ammonia oxidised to nitrate) (Randall, 1992; Robertson and Kuenen, 1991) and 
denitrification (nitrate reduced to dinitrogen gas that dissipates into the atmosphere) (Blackburn, 1983;
Robertson and Kuenen, 1991). Nitrification is the first and most sensitive step of the process and can
be further subdivided into two steps: ammonia oxidation to nitrite and nitrite oxidation to nitrate. The 
two steps are carried out by separate bacterial groups and for both groups, the total diversity of 
organisms with this phenotype is small. 

Therefore, nitrification is a process where reduced nitrogen compounds, generally ammonium
(NH4+), are microbiologically oxidised to nitrate (NO3-) via nitrite (NO2-) under aerobic conditions
(Halling-Sørensen and Jørgensen, 1993). The overall reactions and possible organisms responsible
are: 

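Nitrosomonas

$$2\,\mathrm{NH_4^+} + 3\,\mathrm{O_2} \rightarrow 2\,\mathrm{NO_2^-} + 2\,\mathrm{H_2O} + 4\,\mathrm{H^+} + \text{biomass}$$

Nitrobacter

$$2\,\mathrm{NO_2^-} + \mathrm{O_2} \rightarrow 2\,\mathrm{NO_3^-} + \text{biomass}$$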

The Gram negative chemoautotrophic nitrite oxidising bacteria are physiologically distinct, as
they all possess the ability to use nitrite as their energy source and to assimilate CO2, via the Calvin
Benson cycle, as a carbon source for cell growth (Bock et al., 1992). For each molecule of CO2
fixed, 100 molecules of nitrite need to be oxidized, emphasising the high energy demands placed on
these cells. The overall stoichiometry of nitrite oxidation is (Halling-Sørensen and Jørgensen, 1993):

$$400\,\mathrm{NO_2^-} + \mathrm{NH_4^+} + 4\,\mathrm{H_2CO_3} + \mathrm{HCO_3^-} + 195\,\mathrm{O_2} \rightarrow \mathrm{C_5H_7NO_2} + 3\,\mathrm{H_2O} + 400\,\mathrm{NO_3^-}$$
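As an illustrative check only, and not part of the cited source, the following minimal Python sketch tallies atoms and net charge on each side of the overall stoichiometry given above. The species tables and the helper name `totals` are assumptions introduced for this illustration.

```python
from collections import Counter

# Each entry: (coefficient, atoms per formula unit, charge per formula unit).
REACTANTS = [
    (400, {"N": 1, "O": 2}, -1),          # nitrite, NO2-
    (1,   {"N": 1, "H": 4}, +1),          # ammonium, NH4+
    (4,   {"H": 2, "C": 1, "O": 3}, 0),   # carbonic acid, H2CO3
    (1,   {"H": 1, "C": 1, "O": 3}, -1),  # bicarbonate, HCO3-
    (195, {"O": 2}, 0),                   # oxygen, O2
]
PRODUCTS = [
    (1,   {"C": 5, "H": 7, "N": 1, "O": 2}, 0),  # biomass, C5H7NO2
    (3,   {"H": 2, "O": 1}, 0),                  # water, H2O
    (400, {"N": 1, "O": 3}, -1),                 # nitrate, NO3-
]


def totals(side):
    """Return (atom counts, net charge) summed over one side of the equation."""
    atoms = Counter()
    charge = 0
    for coeff, formula, q in side:
        for element, n in formula.items():
            atoms[element] += coeff * n
        charge += coeff * q
    return atoms, charge


left_atoms, left_charge = totals(REACTANTS)
right_atoms, right_charge = totals(PRODUCTS)
balanced = left_atoms == right_atoms and left_charge == right_charge
print(dict(left_atoms), left_charge, balanced)
```

With the coefficients as written, both sides carry 401 N, 5 C, 13 H and 1205 O atoms and a net charge of -400, so the equation is balanced.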

These bacteria can typically also use nitric oxide (NO) instead of NO2- as an electron source
(Bock et al., 1992). Not all of the known nitrifying bacteria are obligate chemoautotrophs. In fact,
many strains of Nitrobacter can grow well as heterotrophs, where both energy and carbon are 
obtained from organic carbon sources, or mixotrophically (a combination of both autotrophic and 
heterotrophic behaviour). These bacteria are collectively known as facultative chemoautotrophs.
Therefore, bacterial strains can grow in three ways: aerobically and autotrophically, aerobically and
mixotrophically, or anaerobically and heterotrophically. In mixotrophic growth, NO2- is oxidized in
preference to organic carbon substrates like acetate, pyruvate and glycerol. Both autotrophic and
heterotrophic growth are usually slow and inefficient.

As a generalisation, most strains of Nitrobacter seem to be able to grow faster as mixotrophs 
than as heterotrophs and faster heterotrophically or chemo-heterotrophically than 
chemoautotrophically. 

Four genera are currently recognised: Nitrobacter, Nitrospina, Nitrococcus and Nitrospira
(Halling-Sørensen and Jørgensen, 1993). Nitrospina and Nitrococcus are unable to grow
heterotrophically or mixotrophically (Bock et al., 1992). One species of Nitrospira, Nitrospira
marina, can grow autotrophically and mixotrophically (Bock et al., 1992), whereas Nitrospira
moscoviensis is an obligate autotroph (Ehrich et al., 1995). These nitrite oxidizers have also been
conventionally classified based on phenotypic characters like their cell shape and the ultrastructure of
their intracytoplasmic membranes. Doubling times of Nitrobacter can range from 12 to 59 hours, or
even as long as 140 hours (Halling-Sørensen and Jørgensen, 1993). These are therefore very slow
growing bacteria. 

In wastewater treatment systems, Nitrosomonas (an ammonia oxidizer) and Nitrobacter (a 
nitrite oxidizer) are the two autotrophs presumed to be responsible for nitrification because they are 
the commonest ammonia and nitrite oxidizers isolated from these environments (Halling-Sørensen and
Jørgensen, 1993). Although ammonia oxidizers have been intensively studied by the use of molecular
methods (Wagner et al., 1995; Wagner et al., 1996), the nitrite oxidizers have not been similarly 
investigated. Since the microorganisms responsible for nitrite oxidation in wastewater treatment plants 
were presumed to be from the genus Nitrobacter, mathematical modeling of the process has used data 
relevant to this genus. However, fluorescent in situ hybridization (FISH) probing of activated sludge 
mixed liquors with Nitrobacter specific probes (Wagner et al., 1996) could not confirm the presence 
of these organisms suggesting that they were not responsible for this major component of nitrogen 
remediation. Indeed, Nitrobacter could not be found in other aquatic environments (Hovanec and 
DeLong, 1996) when specific FISH probes were employed. It was speculated that other bacteria were 
likely responsible for nitrite oxidation (Hovanec and DeLong, 1996; Wagner et al., 1996).

Knowledge of the microorganisms responsible for nitrification of wastewater is desirable for 
the efficient management of treatment systems. It would also be advantageous to have available 
biomass which can be added to a system to implement or improve nitrification. However, as 
indicated above, there is no certainty in the art as to the actual microorganisms responsible for 
nitrification nor are there methods available for identifying such organisms.






CA 02252064 1998-11-20




SUMMARY OF THE INVENTION 
It is an object of the invention to provide a consortium of microorganisms that can be used for 
nitrification of wastewater. 

A further object of the invention is to provide a method of identifying microorganisms capable 
of nitrification of wastewater.

According to a first embodiment of the invention, there is provided a consortium of 
microorganisms capable of nitrite oxidation in wastewater, which consortium is enriched in members 
of the Nitrospira phylum. 

According to a second embodiment of the invention, there is provided an oligonucleotide 
primer for PCR amplification of Nitrospira DNA, said primer comprising at least 12 nucleotides
having a sequence selected from: 

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; or 

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ 
ID NO: 13. 

According to a third embodiment of the invention, there is provided a primer pair for PCR
amplification of Nitrospira DNA, said primer pair comprising:

(a) a first oligonucleotide of at least 12 nucleotides having a sequence selected from one 
strand of a bacterial 16S rDNA gene; and 

(b) a second oligonucleotide of at least 12 nucleotides having a sequence selected from the
other strand of said 16S rDNA gene downstream of said first oligonucleotide sequence; wherein at
least one of said first and second oligonucleotides is selected from:

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; or 

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ ID
NO: 13. 

According to a fourth embodiment of the invention, there is provided a probe for detecting
Nitrospira DNA, said probe comprising at least 12 nucleotides having a sequence selected from:

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; or 

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ 
ID NO: 13. 

According to a fifth embodiment of the invention, there is provided a kit comprising:

at least one primer according to the second embodiment; 
at least one primer pair according to the third embodiment; or 
at least one probe according to the fourth embodiment. 

According to a sixth embodiment of the invention, there is provided a method of detecting a 
Nitrospira species in a sample, said method comprising the steps of:

(a) lysing cells in said sample to release genomic DNA;

(b) contacting denatured genomic DNA from step (a) with a primer pair according to the 
third embodiment; 

(c) amplifying Nitrospira DNA by cyclically reacting said primer pair with said DNA to 
produce an amplification product; and

(d) detecting said amplification product. 

According to a seventh embodiment of the invention, there is provided a method of 
quantitating the level of a Nitrospira species in a sample, said method comprising the steps of: 
(a) lysing cells in said sample to release genomic DNA; 

(b) contacting denatured genomic DNA from step (a) with a primer pair according to the
third embodiment;

(c) amplifying Nitrospira DNA by cyclically reacting said primer pair with said DNA to 
produce an amplification product; and 

(d) detecting said amplification product and quantitating the level of said product by 
comparison with at least one reference standard.

According to an eighth embodiment of the invention, there is provided a method of detecting a 
Nitrospira species in a sample, said method comprising the steps of: 

(a) lysing cells in said sample to release genomic DNA; 

(b) contacting denatured genomic DNA from step (a) with a labeled probe according to 
the fourth embodiment under conditions which allow hybridisation of said genomic DNA with said probe;

(c) separating hybridised labeled probe and genomic DNA from unhybridised labeled 
probe; and 

(d) detecting said labeled probe-genomic DNA hybrid. 

According to a ninth embodiment of the invention, there is provided a method of detecting 
cells of a Nitrospira species in a sample, said method comprising the steps of:

(a) treating cells in said sample to fix cellular contents; 

(b) contacting said fixed cells from step (a) with a labeled probe according to the fourth 
embodiment under conditions which allow said probe to hybridise with RNA within said fixed cells;

(c) removing unhybridised probe from said fixed cells; and 
(d) detecting said labeled probe-RNA hybrid.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 is a graph showing influent and effluent NO2-N concentrations for an automated
laboratory-scale reactor operating as a sequencing batch reactor at 2 cycles/day with strong selection 
for nitrite oxidising biomass (NOSBR). 
Figure 2 is a graph showing influent and effluent NO2-N concentrations of the NOSBR
operating at 4 cycles/day. 

Figure 3 is a graph of mixed liquor nitrite-N concentrations during the react period of the 
NOSBR cycle for attached growth and for suspended growth. 
Figure 4 is a graph showing nitrite-N and nitrate-N concentrations in the mixed liquor during
the react period of the NOSBR.

Figure 5 is a graph showing mixed liquor nitrite-N concentrations during the react period in
three stages of the NOSBR operated at 2 cycles/day with different concentrations of nitrite in the feed. 

Figure 6 is a graph of mixed liquor nitrite-N concentrations during the react period in three 
representative cycles during operation of the NOSBR at 4 cycles/day.

Figure 7 is an evolutionary distance tree derived from a comparison of 16S rDNA sequences 
from nitrite oxidising bacteria and clone sequences from three different 16S rDNA clone libraries 
(RC, GC, and SBR). 

Figure 8 is an alignment of sequences of 16S rDNA from Nitrospira clones identified in a 
nitrite-oxidising SBR and from other sources.

Figure 9 depicts the results of agarose gel electrophoresis of PCR-amplified DNA using 
genomic DNA from various Nitrospira clones as template. 

BEST MODE AND OTHER MODES OF CARRYING OUT THE INVENTION 

The following abbreviations are used hereafter: 
SBR      sequencing batch reactor
NOSBR    nitrite oxidising SBR
NOM      nitrite oxidising medium
HRT      hydraulic retention time
MLSS     mixed liquor suspended solids
BNR      biological nutrient removal
DO       dissolved oxygen
PCR      polymerase chain reaction
REA      restriction enzyme analysis
OTU      operational taxonomic unit
bp(s)    base pair(s)

The one-letter code for nucleotides in DNA conforms to the IUPAC-IUB standard described 
in Biochemical Journal 219, 345-373 (1984).

The term "comprise", or variations of the term such as "comprises" or "comprising", is
used herein to denote the inclusion of a stated integer or stated integers but not to exclude any other
integer or any other integers, unless in the context or usage an exclusive interpretation of the term is
required. 

The present inventors have developed a specific nitrifying biomass that is largely comprised of 
bacteria that are most closely related to Nitrospira moscoviensis. It is believed that a range of species
of Nitrospira are involved in the process. The inventors have shown that these bacteria are likely to
be more dominant in reactors with good nitrification performance than bacteria from the genus 
Nitrobacter. A range of studies have failed to find Nitrobacter in nitrifying processes (Hovanec & 
DeLong, 1996; Wagner et al., 1996) and evidence is provided below that the organisms responsible 
for this important biochemical reaction in wastewater treatment processes (both suspended and 
attached growth processes) are from the Nitrospira phylum in the domain Bacteria.

With reference to the first embodiment of the invention, the nitrifying biomass can be 
produced by presenting a feed comprising nitrite, dissolved oxygen and dissolved carbon dioxide but 
which is free of organic carbon to seed sludge from any sewage plant exhibiting nitrification. The 
seed sludge is advantageously from a domestic wastewater treatment plant but can also be from an 
abattoir wastewater treatment plant. The nitrite component of the feed can be as low as about 400 
mg/L nitrite-N. The oxygen and carbon dioxide can conveniently be provided as air bubbled through 
the solution. 

Turning to the second embodiment of the invention, oligonucleotide primers typically have a 
length of about 12 to 50 nucleotides. A preferred length is 12 to 22 nucleotides. Particularly 
preferred primers are the following:

5' CGGGAGGGAAGATGGAGC 3' (SEQ ID NO: 14)

5' CCAACCCGGAAAGCGCAGAG 3' (SEQ ID NO: 15)

5' AGCCTGGCAGTACCCTCT 3' (SEQ ID NO: 16)

Oligonucleotide primer pairs according to the third embodiment of the invention comprise an 
oligonucleotide primer that will anneal to one strand of the target sequence and a second oligonucleotide 
primer which will anneal to the other, complementary, strand of the target sequence. It will be 
appreciated that the second oligonucleotide primer must anneal to the complementary strand downstream 
of the first oligonucleotide primer sequence, which occurs in the complementary strand, to yield a double 
stranded amplification product in the PCR. The amplification product is of a size that facilitates
detection. Typically, the first and second oligonucleotide primer sites in the target DNA are separated
by 50 to 1,400 bps. A preferred separation is 400 to 1,000 bps.

The probes of the fourth embodiment, as indicated above, can have a size as small as 12 
nucleotides. Typically, however, probes have a length of 15 to 50 nucleotides. A preferred probe 
length is 15 to 22 nucleotides, particularly for in situ hybridisation according to the method of the ninth 
embodiment.
The oligonucleotide primers included in kits according to the fifth embodiment of the invention
can be individual oligonucleotide primers appropriate for the detection of Nitrospira or a primer pair. 
Oligonucleotide primer pairs are advantageously provided as compositions. Additional oligonucleotide 
primers can also be included in kits for use in control reactions. For detection purposes, DNA probes 
can also be included in kits. 

Kits according to the fifth embodiment of the invention can further comprise reagents used in 
PCR and hybridisation reactions. Such reagents include buffers, salts, detergents, nucleotides and 
thermostable polymerase. Such reagents are advantageously provided as solutions to facilitate execution 
of PCR or hybridisation. Solutions can be compositions comprising a number of reagents as is well 
known in the art. 

The general techniques used in the methods of the sixth to ninth embodiments, and factors to 
be considered in selecting PCR primers and probes, will be known to those of skill in the art. Such 
techniques are described, for example, in Sambrook et al. (1989) and Stackebrandt and Goodfellow
(1991), the entire contents of which are incorporated herein by cross reference. Particularly relevant
chapters in Stackebrandt and Goodfellow are Chapter 7, "The Polymerase Chain Reaction" by S.
Giovannoni, and Chapter 8, "Development and Application of Nucleic Acid Probes" by D. A. Stahl
and R. Amann. 

Non-limiting examples of the invention will now be provided. 
General Methods 

The total community DNAs from the NOSBR sludge (RC) and the seed sludge (GC) were 
isolated, the 16S rDNAs were polymerase chain reaction (PCR) amplified and cloned using previously 
published methods (Blackall, 1994; Blackall et al., 1994; Bond et al., 1995). Inserts from 102 clones
in the RC library were amplified and grouped by HaeIII restriction enzyme digestion banding profiles
(REA) into operational taxonomic units (OTUs) (Weidner et al., 1996). Clone inserts from
representatives of RC OTUs and all 77 clones from the GC library were PCR amplified and partially
sequenced (Blackall, 1994) using the 530f primer (Lane, 1991). Inserts from a selection of clones were
fully sequenced (Blackall, 1994). Sequence data were analysed according to previously published
methods (Blackall et al., 1994) which included BLAST (Altschul et al., 1990) comparisons and
phylogenetic analyses (Felsenstein, 1993). 

Example 1 
Selection of a Nitrifying Biomass 
In this example, we describe the use of a laboratory-scale reactor as a sequencing batch 
reactor (SBR) with strong selection for a nitrite oxidising biomass. Seed sludge was from the 
Merrimac domestic wastewater treatment plant operated by the Gold Coast City Council and located 
at Merrimac, Queensland 4226, Australia. The reactor set-up will be hereafter referred to as the
"Nitrite Oxidising SBR", or "NOSBR".

Reactor. A laboratory chemostat with a working volume of 1 L was operated in the dark at 
24°C as the NOSBR. The influent nitrite oxidising medium (NOM) was a synthetic waste water mix 
comprising per L: 400 to 3,200 mg KNO2, 3.75 g MgSO4.7H2O, 250 mg CaCl2.2H2O, 10 g
KH2PO4, 10 g K2HPO4, 200 mg FeSO4.7H2O, and 20 g NaHCO3. The pH of the medium was
adjusted to 7.0, but the reactor was not equipped with pH control. Dissolved oxygen was maintained
at 1.6-2.0 mg/L and CO2 was introduced by bubbling air through the liquid in the NOSBR. Surface
biomass growth was precluded by regular scrubbing of all solid surfaces with a brush. Four cycles 
per day giving a hydraulic retention time (HRT) of 12 hr were performed with the following 
sequence:

1) Feed of 500 ml of fresh medium - 30 min (0 to 0.5 hr)

2) React (aeration) - 4.5 hr (0.5 to 5 hr)

3) Settle - 40 min (5 to 5.7 hr)

4) Decant 500 ml of supernatant - 20 min (5.7 to 6 hr)

5) Total time per cycle - 6 hr.
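The relationship between the cycle scheme and the stated hydraulic retention time can be sketched as follows. This is a minimal illustration, assuming that HRT is the working volume divided by the daily volume exchanged; the function names are hypothetical and do not come from the specification.

```python
def hydraulic_retention_time_hr(working_volume_l, exchange_volume_l, cycles_per_day):
    """HRT in hours: working volume divided by the daily volumetric throughput."""
    daily_throughput_l = exchange_volume_l * cycles_per_day
    return 24.0 * working_volume_l / daily_throughput_l


def cycle_length_hr(phase_minutes):
    """Total length of one cycle in hours, from the phase durations in minutes."""
    return sum(phase_minutes) / 60.0


# Values stated above: 1 L working volume, 500 ml fed and decanted per cycle,
# four cycles per day; phases are feed, react, settle and decant.
FOUR_CYCLE_PHASES_MIN = [30, 270, 40, 20]

hrt_hr = hydraulic_retention_time_hr(1.0, 0.5, 4)
cycle_hr = cycle_length_hr(FOUR_CYCLE_PHASES_MIN)
print(f"HRT = {hrt_hr} hr, cycle length = {cycle_hr} hr")
```

Under these assumptions the sketch returns an HRT of 12 hr and a cycle length of 6 hr, in agreement with the figures given above.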

Automatic timers controlled the magnetic stirrer (100 rpm), peristaltic pumps (feed and decant), 
and air pump for the cycles. Sludge biomass was not wasted from the reactor, but periodically, 
biomass was collected for testing which facilitated maintenance of a relatively steady amount of
biomass in the SBR.

At start up, 1 L of mixed liquor suspended solids (MLSS) from a full scale Biological Nutrient 
Removal (BNR, nitrogen and phosphorus removal) plant was added to the NOSBR which was 
operated manually with the NOM. Initially manual and then automatic operation with 2 cycles per day
(feed [500 ml] - 40 min; react - 10 hr; settle - 40 min; and decant [500 ml] - 40 min) occurred for
some months before initiation of the 4 cycles per day scheme (see above).

Monitoring. Chemical analyses of feed, mixed liquor and effluent were regularly done for 
nitrite-N (NO2-N), nitrate-N (NO3-N), and ammonium-N (NH4+-N) using spectrometry assays
(Merck, Melbourne, Australia). To preclude the removal of excessive biomass, these analyses were 
done with 2 ml samples. The MLSS of the NOSBR was determined in duplicate 10 ml samples of
mixed liquor. These were filtered onto pre-dried Whatman GF/C filters, and then dried to a constant
weight at 105°C. A pH meter was used to periodically monitor pH in the mixed liquor and
effluent. A portable dissolved oxygen (DO) meter and probe were used to periodically monitor the 
DO in the NOSBR. 
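The specification does not set out the MLSS calculation itself. A minimal sketch of the usual gravimetric arithmetic is given below, assuming that MLSS is the dried mass retained on the filter divided by the sample volume. The filter weights are hypothetical values chosen only to illustrate the calculation, and the function name is likewise an assumption.

```python
def mlss_mg_per_l(filter_tare_mg, filter_plus_solids_mg, sample_volume_ml):
    """MLSS in mg/L from the dry mass retained on a pre-dried, pre-weighed filter."""
    solids_mg = filter_plus_solids_mg - filter_tare_mg
    return solids_mg * 1000.0 / sample_volume_ml


# Hypothetical duplicate 10 ml samples: (tare, tare plus dried solids), in mg.
duplicates = [(120.0, 145.0), (118.0, 144.0)]
values = [mlss_mg_per_l(tare, gross, 10.0) for tare, gross in duplicates]
mean_mlss = sum(values) / len(values)
print(values, mean_mlss)
```

Averaging the duplicates, as in the last step, is one simple way to report a single MLSS value per sampling occasion.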

Results of operation. Varying influent nitrite levels were employed to study a range of features 
of the selected nitrite oxidising biomass. The operating data for the influent and effluent nitrite levels
of the NOSBR during the automated 2 cycles/day period are presented in Figure 1 and for the
automated 4 cycles/day in Figure 2. The data presented in these figures show that the microbial
community is able to remove all the nitrite from the influent in a matter of hours.
Attributes of the NOSBR mixed liquor 
1. Suspended versus attached growth - 2 cycles/day. To generate attached growth, the regular
scrubbing regime of the reactor was suspended for two weeks. The vast bulk of the biomass was then
attached to surfaces in the reactor. The little remaining suspended biomass was discharged from the 
reactor which was then filled with 1 L of half strength NOM. Regular sampling and nitrite analyses 
were done during the react period of one cycle with all the biomass attached to the reactor surfaces. 

The results of this experiment are presented in Figure 3. The results show that suspended biomass has
twice the nitrite oxidation rate of the attached biomass but both systems are effective in removing
nitrite from the influent. 

Following the experiment described in the previous paragraph, the biomass was completely 
scrubbed from the surfaces into the liquid. The reactor was operated for two cycles with biomass
scrubbing. A similar one-cycle study was performed as with the attached growth but with all biomass
suspended. The biofilm growth exhibited a nitrite oxidation rate of 29 mg NO2-N/hr and the
suspended growth form showed a rate of 58 mg NO2-N/hr. It was assumed that the biomass
concentration was the same for both studies since none had been removed between them. 

2. pH correlation with nitrification. It was observed that when the pH of the effluent fell 
below 7.4, nitrite-N was present in the effluent. If the pH rose above 7.4 for short periods, no effect
on nitrification was observed. Therefore, pH values below 7.4 were detrimental to nitrification.

3. Cyclic studies. Figure 4 shows the results for periodic measurements of nitrite-N and 
nitrate-N during the react period of the reactor during 2 cycles/day. The results presented in this
figure show that the bacterial population in the reactor oxidised nitrite to nitrate in a stoichiometric
manner with 160 mg/L of nitrite-N being oxidised to 160 mg/L of nitrate-N (170 mg/L at the start of the
react period and 330 mg/L when the nitrite-N was exhausted). The rate of nitrite oxidation and nitrate
production also appeared to be linear, showing that the oxidation process was not limited by any 
external factors. 
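The nitrogen balance described in this paragraph can be restated as a short calculation. The sketch below uses only the concentrations quoted above; the variable names are introduced for illustration.

```python
# Concentrations in mg/L as N, as quoted for the react period in Figure 4.
nitrite_n_oxidised = 160.0
nitrate_n_start = 170.0
nitrate_n_end = 330.0

nitrate_n_produced = nitrate_n_end - nitrate_n_start
# A ratio of 1.0 means every mg of nitrite-N removed reappears as nitrate-N.
recovery_ratio = nitrate_n_produced / nitrite_n_oxidised
print(nitrate_n_produced, recovery_ratio)
```

A recovery ratio of 1.0 corresponds to the stoichiometric conversion of nitrite-N to nitrate-N reported above.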

Studies measuring nitrite reaction in the reactor are shown for both 2 cycles/day (Figure 5) 
and 4 cycles/day operation (Figure 6). The significance of these results is that the biomass is robust in
its capacity to oxidise nitrite under a range of operating conditions. 

Example 2 
The Microbiology of the NOSBR 
In this example, we describe the microbiological characterisation of the nitrifying 
microorganisms present in the biomass selected in the NOSBR described in Example 1. Methods used
in the characterisation have been described by Blackall (1994) and Bond et al. (1995), the entire
contents of which disclosures are incorporated herein by cross-reference. 

Total microbial community DNA from both the seed BNR sludge (GC) and from the reactor 
after six months of operation (RC) was obtained. The 16S rDNAs from each DNA extract were
separately amplified by polymerase chain reaction (PCR), and then for each, clone libraries were
prepared (Blackall, 1994; Bond et al., 1995). 

Inserts from a total of 77 clones from the GC clone library were partially sequenced with the 
primer 530f and phylogenetically analysed (Blackall et al., 1994) (Table 1). The majority of the clone 
sequences grouped with the proteobacterial phylum, while 4% (3 clones; GC3, GC86 and GC109) 
grouped with the phylum Nitrospira.

Table 1

Phyla from the Domain Bacteria Represented in the GC Clone Library

Phylum in Domain Bacteria              Percentage in clone library

Proteobacteria
    Alpha                              5
    Beta                               29
    Gamma                              18
    Delta                              4
High mol% G+C Gram positives           10
Low mol% G+C Gram positives            7
Flexibacter/Cytophaga/Bacteroides      5
Nitrospira                             4
Planctomycetales                       9
Unaffiliated                           9

Restriction Enzyme Analysis (REA) of the RC library was done to group clones into 
operational taxonomic units (OTUs) in advance of partial or complete clone insert sequencing 
(Weidner et al., 1996). Thirteen different OTUs were found when HaeIII was employed as the
restriction enzyme to digest the inserts from 102 clones. The large majority of the clone inserts (88% 
or 90 clones) were found in one OTU while the remaining 12% (12 clones) comprised individuals in 
12 other OTUs. Each of the clone inserts from the latter 12 OTUs and six of the large former group 
(RC7, RC11, RC16, RC25, RC73, and RC99) were partially sequenced and phylogenetically 
analysed. These six and one of the other OTUs (RC90) were found to have partial insert sequences 
that phylogenetically grouped with the Nitrospira phylum. From this analysis, it was concluded that 
91 clones or 89% of the clone library originated from bacteria in the Nitrospira phylum. In the 
phylogenetic analysis, one of the other OTUs (RC44) grouped with Nitrobacter. It was concluded that
the organisms responsible for nitrification in the NOSBR were likely to be from the Nitrospira 
phylum. 
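By way of illustration only, the grouping of clone inserts into OTUs by restriction profile can be mimicked in silico. In the work described above the grouping was done from gel banding profiles; the sketch below instead cuts sequences at the HaeIII recognition site (GGCC, cut between G and C) and groups inserts with identical sets of fragment lengths. The insert sequences, clone names and the exact-match grouping rule are all hypothetical assumptions.

```python
from collections import defaultdict

HAEIII_SITE = "GGCC"  # HaeIII recognition sequence
CUT_OFFSET = 2        # blunt cut between GG and CC


def fragment_lengths(insert):
    """Sorted fragment lengths from a complete HaeIII digest of a linear insert."""
    cuts = []
    start = insert.find(HAEIII_SITE)
    while start != -1:
        cuts.append(start + CUT_OFFSET)
        start = insert.find(HAEIII_SITE, start + 1)
    bounds = [0] + cuts + [len(insert)]
    return tuple(sorted(b - a for a, b in zip(bounds, bounds[1:])))


def group_into_otus(clones):
    """Group clone names by identical fragment-length profile."""
    otus = defaultdict(list)
    for name, insert in clones.items():
        otus[fragment_lengths(insert)].append(name)
    return dict(otus)


# Hypothetical short inserts, for illustration only.
clones = {
    "cloneA": "AAAGGCCTTTTGGCCAA",
    "cloneB": "TTTGGCCAAAAGGCCTT",
    "cloneC": "AAAAAGGCCTTTTTTTT",
}
otus = group_into_otus(clones)
print(otus)
```

In this toy example cloneA and cloneB share a profile despite differing in sequence, which is why representatives of each OTU are subsequently sequenced rather than assumed identical.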

Near complete insert sequence analyses were done for the following clones: 
- six RC clones of the original partial sequences: RC7, RC11, RC25, RC73, RC90, and RC99
  (RC16 omitted);
- two RC clones from the Nitrospira OTU (RC14 and RC19);
- one of the three GC Nitrospira clones (GC86); and
- four clones from a clone library prepared by Bond et al. (1995) that phylogenetically grouped in
  the Nitrospira phylum.

The data were phylogenetically analysed as shown in Figure 7. The two clone clades would 
likely comprise two separate species with the RC clones possibly comprising more than one species. 

Sequences of clones from the two Nitrospira clades were subjected to direct pairwise sequence 
comparison. The results of this comparison are presented in Table 2. The table is a similarity matrix 
showing the percent similarity between 16S rDNA sequences of Nitrospira moscoviensis, Nitrospira
marina and 13 near complete sequences from clone inserts from a full scale biological nutrient 
removal activated sludge plant (GC86), from the NOSBR (RC clone numbers) and from clones for 
which the partial sequences had been previously reported (SBR clones; Bond et al., 1995). The 
similarity matrix showed that the first clade (SBR1015, SBR1024, SBR2046, GC86) had an average
16S rDNA comparison value of 99.4% while for the second clade (RC7, RC11, RC14, RC19, RC25,
RC73, RC90, RC99, SBR2016), this value was 98.7%. The highest comparative value between an
RC clone sequence and N. moscoviensis was 93.4% for RC25. From the sequence data analysis, the 
two clone clades would likely comprise two separate species, with the RC clones possibly comprising 
more than one species. 

Sequence data for the SBR, GC and RC clones are presented in Figure 8. In this figure,
sequences are divided into blocks with numbers given in square brackets above each block. The clone
identification is given at the left of a line of sequence in each block. Dashes represent unknown 
nucleotides while full stops represent alignment breaks. 
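A pairwise comparison of aligned sequences of the kind summarised in the similarity matrix can be sketched as follows. This is a minimal illustration, not the procedure actually used for Table 2: it assumes that positions marked with a dash (unknown nucleotide) or a full stop (alignment break) in either sequence are simply skipped, and the two short aligned sequences are hypothetical.

```python
SKIP = {"-", "."}  # '-' unknown nucleotide, '.' alignment break (Figure 8 conventions)


def percent_similarity(seq_a, seq_b):
    """Percent identical positions between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in SKIP or b in SKIP:
            continue  # assumption: uninformative positions are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def similarity_matrix(aligned):
    """All-against-all percent similarity, keyed by (name, name)."""
    names = list(aligned)
    return {(x, y): percent_similarity(aligned[x], aligned[y])
            for x in names for y in names}


# Hypothetical aligned fragments, for illustration only.
example = {"cloneX": "ACGT-ACGT.", "cloneY": "ACGTTACGAA"}
matrix = similarity_matrix(example)
print(matrix[("cloneX", "cloneY")])
```

Here eight positions are comparable and seven agree, giving 87.5%. The same function could be applied to a candidate primer or probe to test it against the 92% identity threshold recited in the embodiments.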
The sequences of clones are also presented as sequence listings as follows:

Clone       Sequence Listing Number

SBR1024     1
SBR1015     2
GC86        3
SBR2046     4
RC25        5
RC19        6
SBR2016     7
RC7         8
RC14        9
RC99        10
RC11        11
RC73        12
RC90        13
Example 3
Identification of Nitrospira Species 
Primers for use in a diagnostic PCR for the Nitrospira moscoviensis clade of Figure 7 (see 
Example 2) were designed from aligned sequence datasets (see Tables 3-5 below).
Table 3 is an alignment of 16S rDNA sequences of Nitrospira phylum members and nitrite
oxidisers from other bacterial phyla which was used to design the primer MOS457f (SEQ ID NO: 14)
for the Nitrospira moscoviensis clade. In the table, mismatches with the primer sequence are in bold 
type and are underlined. The melting temperature calculated for MOS457f was 60°C and a fragment 
size of approximately 1052 nucleotides was calculated in a PCR with primer 1492r. The MOS457f 
sequence corresponds to the sequence at positions 440 to 457 of the E. coli 16S rDNA gene.

Table 3

Source of Sequence and Number of Sequence   Sequence              Mismatches
in Sequence Listings

MOS457f primer (SEQ ID NO: 14)              CGGGAGGGAAGATGGAGC    -
Nitrococcus mobilis (SEQ ID NO: 17)         CAGCCGGGAGGAAAAGCA    10
Magnetobacterium bavaricum (SEQ ID NO: 18)  TGTAGGGAAAGATGATGA    8
Nitrobacter hamburgensis (SEQ ID NO: 19)    TGTGCGGGAAGATAATGA    7
Nitrospina gracilis (SEQ ID NO: 20)         CGGGTGGGAAGAACAAAA    6
Nitrospira marina (SEQ ID NO: 21)           CATGAGGAAAGATAAAGT    6
SBR1015 (SEQ ID NO: 22)                     CGGCAGGGAAGATGGAAC    2
SBR1024 (SEQ ID NO: 22)                     CGGCAGGGAAGATGGAAC    2
SBR2016 (SEQ ID NO: 23)                     CGGGAGGGAAGATGGAGC    0
SBR2046 (SEQ ID NO: 24)                     CCGCAGGGAAGATGGAAC    3
RC7 (SEQ ID NO: 23)                         CGGGAGGGAAGATGGAGC    0
RC11 (SEQ ID NO: 23)                        CGGGAGGGAAGATGGAGC    0
RC14 (SEQ ID NO: 23)                        CGGGAGGGAAGATGGAGC    0
RC19 (SEQ ID NO: 23)                        CGGGAGGGAAGATGGAGC    0
RC25 (SEQ ID NO: 23)                        CGGGAGGGAAGATGGAGC    0
RC73 (SEQ ID NO: 25)                        CGGGAGGGAAGATGGAAC    1
RC90 (SEQ ID NO: 25)                        CGGGAGGGAAGATGGAAC    1
RC99 (SEQ ID NO: 23)                        CGGGAGGGAAGATGGAGC    0
RC44 (Nitrobacter clone) (SEQ ID NO: 26)    CGTGCGGGAAGATAATGA    6
GC86 (SEQ ID NO: 27)                        CGGCAGGGAAGATGGAAC    2
Nitrospira moscoviensis (SEQ ID NO: 28)     CGGGAGGGAAGATGGACG    2
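
The mismatch counts in Table 3 can be reproduced by comparing each aligned target region with the primer, position by position. The following is a minimal Python sketch under that assumption; the function name `count_mismatches` and the dictionary layout are illustrative only and are not part of the specification, and only a subset of the Table 3 rows is included.

```python
def count_mismatches(primer: str, target: str) -> int:
    """Count aligned positions at which the target differs from the primer."""
    if len(primer) != len(target):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for p, t in zip(primer.upper(), target.upper()) if p != t)


MOS457F = "CGGGAGGGAAGATGGAGC"  # SEQ ID NO: 14

# Aligned target regions copied from Table 3 (subset of rows).
TABLE3_TARGETS = {
    "Nitrococcus mobilis": "CAGCCGGGAGGAAAAGCA",
    "Nitrospina gracilis": "CGGGTGGGAAGAACAAAA",
    "SBR1015": "CGGCAGGGAAGATGGAAC",
    "RC7": "CGGGAGGGAAGATGGAGC",
    "RC73": "CGGGAGGGAAGATGGAAC",
    "Nitrospira moscoviensis": "CGGGAGGGAAGATGGACG",
}

mismatches = {
    name: count_mismatches(MOS457F, seq) for name, seq in TABLE3_TARGETS.items()
}

for name, n in mismatches.items():
    print(f"{name}: {n}")
```

The same function applies unchanged to the alignments for the other primers, provided the target region has already been aligned to the primer without gaps.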

Like Table 3, Table 4 is an alignment of 16S rDNA sequences of Nitrospira phylum members
and nitrite oxidisers from other bacterial phyla which was used to design the primer MOS638f (SEQ
ID NO: 15) for the Nitrospira moscoviensis clade. Again, mismatches with the primer sequence are
in bold and are underlined. The calculated melting temperature for this primer was 66°C and a
fragment size of approximately 873 nucleotides was calculated in a PCR with primer 1492r. The
MOS638f sequence corresponds to the sequence at positions 619 to 638 of the E. coli 16S rDNA
gene.

Table 4

Source of Sequence and Number of Sequence   Sequence                Mismatches
in Sequence Listings

MOS638f primer (SEQ ID NO: 15)              CCAACCCGGAAAGCGCAGAG    -
Nitrococcus mobilis (SEQ ID NO: 29)         TCAACCTGGGAATTGCATCC    8
Magnetobacterium bavaricum (SEQ ID NO: 30)  TCAACCCGGGAATTGCCTTG    7
Nitrobacter hamburgensis (SEQ ID NO: 31)    TCAACTCCAGAACTGCCTTT    11
Nitrospina gracilis (SEQ ID NO: 32)         TCAACCGTGGAATTGCGTTT    10
Nitrospira marina (SEQ ID NO: 33)           TTAACCGGGAAAGGTCGAGA    9
SBR1015 (SEQ ID NO: 34)                     CTAACCCGGAAAGTGCGGAG    3
SBR1024 (SEQ ID NO: 34)                     CTAACCCGGAAAGTGCGGAG    3
SBR2046 (SEQ ID NO: 34)                     CTAACCCGGAAAGTGCGGAG    3
RC7 (SEQ ID NO: 36)                         CCAACCCGGAAAGCGCAGAG    0
RC11 (SEQ ID NO: 36)                        CCAACCCGGAAAGCGCAGAG    0
RC14 (SEQ ID NO: 36)                        CCAACCCGGAAAGCGCAGAG    0
RC19 (SEQ ID NO: 36)                        CCAACCCGGAAAGCGCAGAG    0
RC25 (SEQ ID NO: 36)                        CCAACCCGGAAAGCGCAGAG    0
RC73 (SEQ ID NO: 36)                        CCAACCCGGAAAGCGCAGAG    0
RC90 (SEQ ID NO: 36)                        CCAACCCGGAAAGCGCAGAG    0
RC99 (SEQ ID NO: 36)                        CCAACCCGGAAAGCGCAGAG    0
RC44 (Nitrobacter clone) (SEQ ID NO: 37)    TCAACTCCAGAACTGCCTTT    11
GC86 (SEQ ID NO: 34)                        CTAACCCGGAAAGTGCGGAG    3
Nitrospira moscoviensis (SEQ ID NO: 38)     CCAACCCGGAAAGCGCAGAG    0
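
The specification does not state how the primer melting temperatures were calculated. The values quoted for MOS457f (60°C) and MOS638f (66°C) are, however, consistent with the simple Wallace rule, Tm = 2(A+T) + 4(G+C), so the sketch below assumes that rule. The function name `wallace_tm` is illustrative, and other methods (for example nearest-neighbour models) would give different figures.

```python
def wallace_tm(primer: str) -> int:
    """Estimate melting temperature in deg C as 2*(A+T) + 4*(G+C).

    Assumption: the Wallace rule is used here only because it reproduces
    the figures quoted in the text; the source does not name its method.
    """
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


PRIMERS = {
    "MOS457f": "CGGGAGGGAAGATGGAGC",    # SEQ ID NO: 14
    "MOS638f": "CCAACCCGGAAAGCGCAGAG",  # SEQ ID NO: 15
}

tm = {name: wallace_tm(seq) for name, seq in PRIMERS.items()}

for name, value in tm.items():
    print(f"{name}: {value} deg C")
```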



Table 5 is again an alignment of 16S rDNA sequences of Nitrospira phylum members and
nitrite oxidisers from other bacterial phyla which was used to design the primer MOS635r (SEQ ID
NO: 16) for the Nitrospira moscoviensis clade. The melting temperature calculated for this primer
was 58°C and a fragment size of approximately 625 nucleotides was calculated in a PCR with primer
27f. The MOS635r sequence corresponds to the sequence at positions 635 to 652 of the E. coli 16S
rDNA sequence.

Table 5

Source of Sequence and Number of Sequence   Sequence              Mismatches
in Sequence Listings

MOS635r primer (SEQ ID NO: 16)              AGCCTGGCAGTACCCTCT    -
Nitrococcus mobilis (SEQ ID NO: 39)         AGCCAAACAGTATCGGAT    7
Magnetobacterium bavaricum (SEQ ID NO: 40)  AGTTAAACAGTTTTCAAG    11
Nitrobacter hamburgensis (SEQ ID NO: 41)    AGACCTTCAGTATCAAAG    9
Nitrospina gracilis (SEQ ID NO: 42)         AGCCGAATAGTTTCAAAC    10
Nitrospira marina (SEQ ID NO: 43)           AGCTGAATAGTTCCTCTC    10
SBR1015 (SEQ ID NO: 44)                     AGCCGAGCAGTCCCCTCC    4
SBR1024 (SEQ ID NO: 44)                     AGCCGAGCAGTCCCCTCC    4
SBR2016 (SEQ ID NO: 45)                     AGCCTGGCAGTACCCTCT    0
SBR2046 (SEQ ID NO: 44)                     AGCCGAGCAGTCCCCTCC    4
RC7 (SEQ ID NO: 46)                         AGCCTGGCAGTACCCCCT    1
RC11 (SEQ ID NO: 45)                        AGCCTGGCAGTACCCTCT    0
RC14 (SEQ ID NO: 45)                        AGCCTGGCAGTACCCTCT    0
RC19 (SEQ ID NO: 45)                        AGCCTGGCAGTACCCTCT    0
RC25 (SEQ ID NO: 47)                        AGCCTGGCAGTACCGTCT    1
RC73 (SEQ ID NO: 45)                        AGCCTGGCAGTACCCTCT    0
RC90 (SEQ ID NO: 45)                        AGCCTGGCAGTACCCTCT    0
RC99 (SEQ ID NO: 45)                        AGCCTGGCAGTACCCTCT    0
RC44 (Nitrobacter clone) (SEQ ID NO: 48)    AGAJCCTCAGTATCAAAG    10
GC86 (SEQ ID NO: 44)                        AGCCGAGCAGTCCCCTCC    4
Nitrospira moscoviensis (SEQ ID NO: 49)     AGCCTGGCAGTACCCTCT    0
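
The approximate fragment sizes quoted for the three MOS primers (1052, 873 and 625 nucleotides) can be reproduced by subtracting E. coli 16S rDNA coordinates. The sketch below assumes that this is how the figures were obtained, and it assumes positions 27 and 1492 for the universal primers 27f and 1492r on the basis of their names; neither assumption is stated in the text.

```python
# E. coli 16S rDNA coordinates. The MOS primer positions are quoted in the
# text; the positions of the universal primers are assumed from their names.
UNIVERSAL_27F_END = 27
UNIVERSAL_1492R_START = 1492


def approx_amplicon_size(forward_pos: int, reverse_pos: int) -> int:
    """Approximate product length as the distance between two E. coli positions."""
    if reverse_pos <= forward_pos:
        raise ValueError("reverse primer must lie downstream of the forward primer")
    return reverse_pos - forward_pos


sizes = {
    # MOS457f starts at position 440, MOS638f at position 619.
    "MOS457f-1492r": approx_amplicon_size(440, UNIVERSAL_1492R_START),
    "MOS638f-1492r": approx_amplicon_size(619, UNIVERSAL_1492R_START),
    # MOS635r is a reverse primer ending at position 652.
    "27f-MOS635r": approx_amplicon_size(UNIVERSAL_27F_END, 652),
}

for pair, size in sizes.items():
    print(f"{pair}: about {size} nucleotides")
```

These are estimates on the E. coli numbering only; the actual product length depends on insertions and deletions in the template relative to E. coli.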



The three primers defined above in Tables 3 to 5 were included in separate primer pairs, and these
pairs were then tested in PCR amplifications using genomic DNA from various Nitrospira clones as
template. The PCRs were carried out according to methods detailed in Sambrook et al. (1989) at an
annealing temperature of 62°C.

The results of electrophoretic analysis of PCRs on an agarose gel are presented in Figure 9.
Details of the material analysed in each lane of the gel are given in Table 6. The marker DNA was
HaeIII-digested φX174 DNA. The sizes of the φX174 fragments are given on the left-hand side of the
figure.

Table 6

Lane   Primer pair used               Mismatches between primer and template

1      (HaeIII-digested φX174 DNA)
2      MOS457f, 1492r                 0 mismatches with MOS457f
3      MOS457f, 1492r                 1 mismatch with MOS457f
4      MOS457f, 1492r                 2 mismatches with MOS457f
5      (HaeIII-digested φX174 DNA)
6      MOS638f, 1492r                 0 mismatches with MOS638f
7      MOS638f, 1492r                 1 mismatch with MOS638f
8      MOS638f, 1492r                 3 mismatches with MOS638f
9      (HaeIII-digested φX174 DNA)
10     MOS635r, 27f                   0 mismatches with MOS635r
11     MOS635r, 27f                   1 mismatch with MOS635r
12     MOS635r, 27f                   4 mismatches with MOS635r



The results presented in Figure 9 show that an amplicon of the appropriate size was obtained
in reactions where there was up to one mismatch between a primer and the template but that no
amplicon was produced where there was a greater degree of mismatch.

When the three primer pairs used for the results presented in Figure 9 were used with clone 
RC44 (closest match to Nitrobacter), no amplicons were produced. 
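
The observation above (amplification with up to one primer-template mismatch, none with more) can be written as a simple threshold test. The sketch below applies it to mismatch counts copied from Table 5 for MOS635r. The threshold of one mismatch is the empirical result reported for these primers at a 62°C annealing temperature, not a general rule, and the function name `expected_to_amplify` is illustrative.

```python
# Mismatch counts against MOS635r, copied from Table 5 (subset of rows).
MOS635R_MISMATCHES = {
    "SBR1015": 4,
    "SBR2016": 0,
    "RC7": 1,
    "RC25": 1,
    "RC44 (Nitrobacter clone)": 10,
    "GC86": 4,
}


def expected_to_amplify(mismatches: int, max_mismatches: int = 1) -> bool:
    """Apply the empirical cut-off reported for Figure 9 (assumed default: 1)."""
    return mismatches <= max_mismatches


predicted = {
    clone: expected_to_amplify(n) for clone, n in MOS635R_MISMATCHES.items()
}
positives = sorted(clone for clone, ok in predicted.items() if ok)

print("Expected to give an amplicon with MOS635r-27f:", ", ".join(positives))
```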

The primer NIT3 (Wagner et al. 1996; SEQ ID NO: 50) was used in a diagnostic PCR for
Nitrobacter. NIT3 was designed originally for fluorescent in situ hybridisation experiments. The
specificity of this primer can be appreciated from the sequence alignment presented in Table 7, which
is an alignment of 16S rDNA sequences of Nitrospira phylum members and nitrite oxidisers from
other bacterial phyla against NIT3. A melting temperature of 60°C was calculated for NIT3, and a
fragment size of approximately 1020 nucleotides in a PCR with primer 27f was experimentally
determined. The NIT3 sequence corresponds to the sequence at positions 1031 to 1048 of the E. coli
16S rDNA gene.

Table 7

Source of Sequence and Number of Sequence   Sequence              Mismatches
in Sequence Listings

NIT3 primer (SEQ ID NO: 50)                 CCTGTGCTCCATGCTCCG    -
Nitrobacter hamburgensis (SEQ ID NO: 51)    CCTGTGCTCCATGCTCCG    0
Nitrospina gracilis (SEQ ID NO: 52)         CCTGTGCAAGGGCCCCGA    9
Nitrococcus mobilis (SEQ ID NO: 53)         CCTGTCATCCGGTTCCCG    7
Nitrospira moscoviensis (SEQ ID NO: 54)     CCTGAGCACGCTGGTATT    8
Nitrospira marina (SEQ ID NO: 55)           CCTGAGCTCGCTCCCCTT    7
Magnetobacterium bavaricum (SEQ ID NO: 56)  CCTGTGCAAGCTCTCCCT    8
SBR1015 (SEQ ID NO: 57)                     CCTGAGCAGGATGGTATT    8
SBR1024 (SEQ ID NO: 57)                     CCTGAGCAGGATGGTATT    8
SBR2016 (SEQ ID NO: 58)                     CCTGAGCACGCTGGTATT    8
SBR2046 (SEQ ID NO: 57)                     CCTGAGCAGGATGGTATT    8
RC7 (SEQ ID NO: 58)                         CCTGAGCACGCTGGTATT    8
RC11 (SEQ ID NO: 58)                        CCTGAGCACGCTGGTATT    8
RC14 (SEQ ID NO: 58)                        CCTGAGCACGCTGGTATT    8
RC19 (SEQ ID NO: 58)                        CCTGAGCACGCTGGTATT    8
RC25 (SEQ ID NO: 58)                        CCTGAGCACGCTGGTATT    8
RC73 (SEQ ID NO: 58)                        CCTGAGCACGCTGGTATT    8
RC90 (SEQ ID NO: 58)                        CCTGAGCACGCTGGTATT    8
GC86 (SEQ ID NO: 59)                        CCTGAGCAGGATGGTGTT    8
RC99 (SEQ ID NO: 58)                        CCTGAGCACGCTGGTATT    8



Results of PCRs with the primer pair NIT3 and 27f showed that the NIT3 primer specifically
amplified only RC44 clone inserts (Nitrobacter) and not those from Nitrospira clones.

The different primer pairs were then used with DNAs extracted from sludges and the results
are tabulated below in Table 8. The scorings presented in the table were generated by quantitating by
eye the intensity of the amplificate in a stained gel. A definition of the scoring follows: - = no band;
+/- = very faint band; + through + + + + = increasing intensity of the amplificate.

Table 8

Wastewater Treatment Plant     Performance             MOS635r-27f    NIT3-27f
                                                       620 bp         1020 bp

Oxley                          Full nitrification      + + + +        + +
Merrimac                       Full nitrification      + + + +        + +
Loganholme                     Full nitrification      + + +          +/-
Gibson Island                  Full nitrification      + + +
Fairfield                      No nitrification        +/-            + + +
[illegible] Hill               Full nitrification      +              +
NOSBR                          NO2 oxidation           + + + + +      + + + +
Saline waste water BNR SBR     Partial nitrification   +/-            + +
Nitrifying biofilm reactor     Full nitrification      + + + +        + + + +
Phenol/cyanide removing SBR    No nitrification        +/-            + +
BNR SBR                        Full nitrification      +              +

These results show that in plants having good nitrification, Nitrospira species were present as
evidenced by amplification of target DNA with the selected primer pairs.
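
For comparing plants, the by-eye band scores of Table 8 can be mapped onto an ordinal scale. The sketch below is one possible encoding: the numeric values (0 for no band, 0.5 for a very faint band, and the number of plus signs otherwise) are an assumption made for illustration and do not appear in the specification, as is the function name `band_score`.

```python
def band_score(symbol: str) -> float:
    """Convert a printed band score to a number (illustrative scale only)."""
    s = "".join(symbol.split())  # drop the spaces between printed symbols
    if s == "-":
        return 0.0
    if s == "+/-":
        return 0.5
    if s and set(s) == {"+"}:
        return float(len(s))
    raise ValueError(f"unrecognised score: {symbol!r}")


# (MOS635r-27f score, NIT3-27f score), copied from Table 8 (subset of rows).
TABLE8_SUBSET = {
    "Merrimac": ("+ + + +", "+ +"),
    "Loganholme": ("+ + +", "+/-"),
    "Fairfield": ("+ /-", "+ + +"),
}

scores = {
    plant: (band_score(mos), band_score(nit))
    for plant, (mos, nit) in TABLE8_SUBSET.items()
}

for plant, (mos, nit) in scores.items():
    print(f"{plant}: MOS635r-27f={mos}, NIT3-27f={nit}")
```

Because the scores were assigned by eye, the resulting numbers are ordinal at best and should not be read as quantitative abundances.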






CA 02252064 1998-11-20 




REFERENCES 

Blackall, L. L. (1994). Molecular identification of activated sludge foaming bacteria. Water Science
and Technology 29(7), 35-42.

Blackall, L. L., Seviour, E.M., Cunningham, M.A., Seviour, R.J., and Hugenholtz, P. (1994). 
"Microthrix parvicella" is a novel, deep branching member of the actinomycetes subphylum. 
Systematic and Applied Microbiology 17, 513-518. 

Blackburn, T. H. (1983). The Microbial Nitrogen Cycle. In Microbial Geochemistry, (pp. 63-89). 
Edited by W. E. Krumbein. Oxford: Blackwell Scientific Publications. 

Bock, E., Koops, H., Ahlers, B., and Harms, H. (1992). Oxidation of inorganic nitrogen compounds
as energy source. In The Prokaryotes - A Handbook on the Biology of Bacteria: Ecophysiology,
Isolation, Identification, Applications, pp. 414-430. Edited by A. Balows, H. G. Trüper, M.
Dworkin, W. Harder & K.-H. Schleifer. New York: Springer-Verlag.

Bond, P.L., Hugenholtz, P., Keller, J., and Blackall, L.L. (1995). Bacterial community structures of
phosphate-removing and non-phosphate-removing activated sludges from sequencing batch reactors.
Applied and Environmental Microbiology, 61, 1910-1916.

Burrell, P., and Blackall, L.L. (in press). The microbiology of nitrogen removal in activated sludge
systems. In The Microbiology of Activated Sludge, Edited by R. J. Seviour & L. L. Blackall.

Ehrich, S., Behrens, D., Lebedeva, E., Ludwig, W. and Bock, E. (1995). A new obligately
chemolithoautotrophic, nitrite-oxidizing bacterium, Nitrospira moscoviensis sp. nov. and its
phylogenetic relationship. Archives of Microbiology, 164, 16-23.

Halling-Sørensen, B., and Jørgensen, S.E. (1993). The Removal of Nitrogen Compounds from
Wastewater, Amsterdam: Elsevier.

Hovanec, T. A. & DeLong, E. F. (1996). Comparative Analysis of Nitrifying Bacteria Associated
with Freshwater and Marine Aquaria. Applied and Environmental Microbiology, 62, 2888-2896.

Meganck, M.T.J., and Faup, G.M. (1988). Enhanced biological phosphorus removal from waste
waters. Biotreatment Systems, 3, 111-204.

Randall, C. W. (1992). Introduction and Principles of Biological Nutrient Removal. In Design and 
Retrofit of Wastewater Treatment Plants for Biological Nutrient Removal, (pp. 7-84). Edited by C. 
W. Randall. Lancaster: Technomic Publishing Company Inc. 

Robertson, L. A. & Kuenen, J. G. (1991). Physiology of Nitrifying and Denitrifying Bacteria. In 
Microbial Production and Consumption of Greenhouse Gases: Methane. Nitrogen Oxides and 
Halomethanes, (pp. 189-199). Edited by J. E. Rogers & W. B. Whitman. Washington DC.: 
American Society for Microbiology. 

Sambrook, J., Fritsch, E. F., and Maniatis, T., Molecular Cloning: A Laboratory Manual. Second 
Edition, Cold Spring Harbor Laboratory Press, Plainview, New York, 1989. 



Stackebrandt, E., and Goodfellow, M., eds, Nucleic Acid Techniques in Bacterial Systematics, John
Wiley & Sons, New York, 1991.

Wagner, M., Rath, G., Koops, H.-P., Flood, J., and Amann, R. (1996). In situ analysis of nitrifying
bacteria in sewage treatment plants. Water Science and Technology, 34, 237-244.

Weidner, S., Arnold, W. and Pühler, A. (1996). Diversity of Uncultured Microorganisms Associated
with the seagrass Halophila stipulacea Estimated by Restriction Fragment Length Polymorphism
Analysis of PCR-Amplified 16S rRNA Genes. Applied and Environmental Microbiology, 62, 766-771.




SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT:
          (A) NAME: CRC for Waste Management and Pollution Control Limited
          (B) STREET: High Street
          (C) CITY: Kensington
          (D) STATE: New South Wales
          (E) COUNTRY: Australia
          (F) POSTAL CODE (ZIP): 2033

    (ii) TITLE OF INVENTION: Aquatic Nitrite Oxidising Microorganisms

   (iii) NUMBER OF SEQUENCES: 59

    (iv) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1428 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Nitrospira

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CAAGTCGAGC GAGAAGACGT AGCAATACGT TTGTAAAGCG GCGAACGGGT GAGGAATACA    60
TGGGTAACCT ACCTTCGAGT GGGGAATAAC TAGCCGAAAG GTTAGCTAAT ACCGCATACG   120
ACTCCTGGTC TGCGGATCGG GAGAGAAAGC GATACCGTGG GTATCGCGCT CTTGGATGGG   180
CTCATGTCCT ATCAGCTTGT TGGTGAGGTA ACGGCTCACC AAGGCTTCGA CGGGTAGCTG   240
GTCTGAGAGG ACGATCAGCC ACACTGGCAC TGCGACACGG GCCAGACTCC TACGGGAGGC   300
AGCAGTAAGG AATATTGCGC AATGGGCGAC AGCCTGACGC AGCNACGCCG CGTGGGGGAT   360
GAAGGTCTTC GGATTGTAAA CCCCTTTCGG CAGGGAAGAT GGAACGGGTA ACCGTTCGGA   420
CGGTACCTGC AGAAGCAGCC ACGGCTAACT TCGTGCCAGC AGCCGCGGTA ATACGAAGGT   480
GGCAAGCGTT GTTCGGATTT ACTGGGCGTA CAGGGAGCGT AGGCGGTTGG GTAAGCCCTC   540
CGTGAAATCT CCGGGCCTAA CCCGGAAAGT GCGGAGGGGA CTGCTCGGCT AGAGGATGGG   600
AGAGGAGCGC GGAATTCCCG GTGTAGCGGT GAAATGCGTA GAGATCGGGA GGAAGGCCGG   660
TGGCGAAGGC GGCGCTCTGG AACATTTCTG ACGCTGAGGC TCGAAAGCGT GGGGAGCAAA   720
CAGGATTAGA TACCCTGGTA GTCCACGCCT TAAACGATGG ATACTAAGTG TCGGCGGGTT   780
ACCGCCGGTG CCGCAGCTAA CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT   840
GAAACTCAAA GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC   900
GCAACGCGAA GAACCTTACC CAGGCTGGAC ATGCAGGTAG TAGAAGGGTG AAAGCCTAAC   960
GAGGTAGCAA TACCATCCTG CTCAGGTGCT GCATGGCTGT CGTCAGCTCG TGCCGTGAGG  1020
TGTTGGGTTA AGTCCCGCAA CGAGCGCAAC CCCTGTGTTC AGTTACCAAC GGGTCATGCC  1080
GGGAACTCTG GAGAGACTGC CCAGGAGAAC GGGGAGGAAG GTGGGGATGA CGTCAAGTCA  1140
GCATGGCCTT TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA AGCGCTGCAA  1200
ACCCGTAAGG GGGAGCCAAT CCCAAAAAAC CGGCCTCAGT TCAGATTGAG GTCTGCAACT  1260
CGACCTCATG AAGGCGGAAT CGCTAGTAAT CCCGGATCAG CACGCCGGGG TGAATACGTN  1320
CCCGGGCCTT GTACACACCG CCCGTCACAC CACGAAAGTT TGTTGTACCT GAAGTCGTTG  1380
GCGCCAACCG CAAGGAGGCA GACGCCCACG GTATGACCGA TGATTGGG  1428

(2) INFORMATION FOR SEQ ID NO: 2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1407 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Nitrospira

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TAATACATGC AAGTCGAGCG AGAAGACGTA GCAATACGTT TGTAAAGCGG CGAACGGGTG    60
AGGAATACAT GGGTAGCCTA CCCTCGAGTG GGGAATAACT AACCGAAAGG TTAGCTAATA   120
CCGCATACGG CTCCTGGTCT GCGGATCGGG AGAGAAAGCG ATACCGTGGG TATCGCGCTC   180
TTGGATGGGC TCATGTCCTA TCAGCTTGTT GGTGAGGTAA CGGCTCACCA AGGCTTCGAC   240
GGGTAGCTGG TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG CCAGACTCCT   300
ACGGGAGGCA GCAGTAAGGA ATATTGCGCA ATGGGCGACA GCCTGACGCA GCNACGCCGC   360
GTGGGGGATG AAGGTCTTCG GATTGTAAAC CCCTTTCGGC AGGGAAGATG GAACGGGTAA   420
CCGTTCGGAC GGTACCTGCA GAAGCAGCCA CGGCTAACTT CGTGCCAGCA GCCGCGGTAA   480
TACGAAGGTG GCAAGCGTTG TTCGGATTTA CTGGGCGTAC AGGGAGCGTA GGCGGTTGGG   540
TAAGCCCTCC GTGAAATCTC CGGGCCTAAC CCGGAAAGTG CGGAGGGGAC TGCTCGGCTA   600
GAGGATGGGA GAGGAGCGCG GAATTCCCGG TGTAGCGGTG AAATGCGTAG AGATCGGGAG   660
GAAGGCCGGT GGCGAAGGCG GCGCTCTGGA ACATTTCTGA CGCTGAGGCT CGAAAGCGTG   720
GGGAGCAAAC AGGATTAGAT ACCCTGGTAG TCCACGCCTT AAACGATGGA TACTAAGTGT   780
CGGCGGGTTA CCGCCGGTGC CGCAGCTAAC GCATTAAGTA TCCCGCCTGG GAAGTACGGC   840
CGCAAGGTTG AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCATGTGGTT   900
TAATTCGACG CAACGCGAAG AACCTTACCC AGGCTGGACA TGCAGGTAGT AGAAGGGTGA   960
AAGCCTAACG AGGTAGCAAT ACCATCCTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT  1020
GCCGTGAGGT GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGTCTTCA GTTACCAACG  1080
GGTCATGCCG GGAACTCTGG AGAGACTGCC CAGGAGAACG GGGGAGGAAG GTGGGGATGA  1140
CGTCAAGTCA GCATGGCCTT TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA  1200
AGCGCTGCAA ACCCGTAAGG GGGAGCCAAT CGCAAAAAAC CGGCCTCAGT TCAGATTGAG  1260
GTCTGCAACT CGACCTCATG AAGGCGGAAT CGCTAGTAAT CCCGGATCAG CACGCCGGGG  1320
TGAATACGTN CCCGGACCTT GTACACACCG CCCGTCACAC CACGAAAGTT TGTTGTACCT  1380
GAAGTCGTTG GCGCCAACCG CAAGGAG  1407

(2) INFORMATION FOR SEQ ID NO: 3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1500 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Nitrospira

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TTGATCCTGG CTCAGAACGA ACGCTGGCGG CGCGCCTAAT ACATGCAAGT CGAGCGAGAA    60
GACGTAGCAA TACGTTTGTA AAGCGGCGAA CGGGTGAGGA ATACATGGGT AACCTACCCT   120
CGAGTGGGGA ATAACTAGCC GAAAGGTTAG CTAATACCGC ATACGACTCC TGGTCTGCGG   180
ATCGGGAGAG AAAGCGATAC CGTGGGTATC GCGCTCTTGG ATGGGCTCAT GTCCTATCAG   240
CTTGTTGGTG AGGTAACGGC TCACCAAGGC TTCGACGGGT AGCTGGTCTG AGAGGACGAT   300
CAGCCACACT GGCACTGCGA CACGGGCCAG ACTCCTACGG GAGGCAGCAG TAAGGAATAT   360
TGCGCAATGG GCGACAGCCT GACGCAGCNA CGCCGCGTGG GGGATGAAGG TCTTCGGATT   420
GTAAACCCCT TTCGGCAGGG AAGATGGAAC GGGTAACCGT TCGGACGGTA CCTGCAGAAG   480
CAGCCACGGC TAACTTCGTG CCAGCAGCCG CGGTAATACG AAGGTGGCAA GCGTTGTTCG   540
GATTTACTGG GCGTACAGGG AGCGTAGGCG GTTGGGTAAG CCCTCCGTGA AATCTCCGGG   600
CCTAACCCGG AAAGTGCGGA GGGGACTGCT CGGCTAGAGG ATGGGAGAGG AGCGCGGAAT   660
TCCCGGTGTA GCGGTGAAAT GCGTAGAGAT CGGGAGGAAG GCCGGTGGCG AAGGCGGCGC   720
TCTGGAACAT TTCTGACGCT GAGGCTCGAA AGCGTGGGGA GCAAACAGGA TTAGATACCC   780
TGGTAGTCCA CGCCTTAAAC GATGGATACT AAGTGTCGGC GGGTTACCGC CGGTGCCGCA   840
GCTAACGCAT TAAGTATCCC GCCTGGGAAG TACGGCCGCA AGGTTGAAAC TCAAAGGAAT   900
TGACGGGGGC CCGCACAAGC GGTGGAGCAT GTGGTTTAAT TCGACGCAAC GCGAAGAACC   960
TTACCCAGGC TGGACATGCA GGTAGTAGAA GGGTGAAAGC CTAACGAGGT AGCAACACCA  1020
TCCTGCTCAG GTGCTGCATG GCTGTCGTCA GCTCGTGCCG TGAGGTGTTG GGTTAAGTCC  1080
CGCAACGAGC GCAACCCCTG TCTTCAGTTA CCAACGGGTC ATGCCGGGAA CTCTGGAGAG  1140
ACTGCCCAGG AGAACGGGGA GGAAGGTGGG GATGACGTCA AGTCAGCATG GCCTTTATGC  1200
CTGGGGCCAC ACACGTGCTA CAATGGCCGG TACAAAGCGC TGCAAACCCG TAAGGGGGAG  1260
CCAATCGCAA AAAACCGGCC TCAGTTCAGA TTGAGGTCTG CAACTCGACC TCATGAAGGC  1320
GGAATCGCTA GTAATCCCGG ATCAGCACGC CGGGGTGAAT ACGTNCCCGG GCCTTGTACA  1380
CACCGCCCGT CACACCACGA AAGTTTGTTG TACCTGAAGT CGTTGGCGCC AACCGCAAGG  1440
GGGCAGACGC CCACGGTATG ACCGATGATT GGGGTGAAGT CGTAACAAGG TAACCGTAAC  1500

(2) INFORMATION FOR SEQ ID NO: 4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1420 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Nitrospira

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CGAGAAGACG TAGCAATACG TTTGTAAAGC GGCGAACGGG TGAGGAATAC ATGGGTAACC    60
TACCCTCGAG TGGGGAATAA CTAACCGAAA GGTTAGCTAA TACCGCATAC GGCTCCTGGT   120
CTGCGGATCG GGAGAGAAAG CGATACCGTG GGTATCGCGC TCTTGGATGG GCTCATGTCC   180
TATCAGCTTG TTGGTGAGGT AACGGCTCAC CAAGGCTTCG ACGGGTAGCT GGTCTGAGAG   240
GACGATCAGC CACACTGGCA CTGCGACACG GGCCAGACTC CTACGGGAGG CAGCAGTAAG   300
GAATATTGCG CAATGGGCGA CAGCCTGACG CAGCGACGCC GCGTTGGGGA TGAAAGTCTT   360
CCGATTGTAA ACCCCTTTCC GCAGGGAAGA TGGAACGGGT AACCGTTCGG ACGGTACCTG   420
CAGAAGCAGC CACGGCTAAC TTCGTGCCAG CAGCCGCGGT AATACGAAGG TGGCAAGCGT   480
TGTTCGGATT TACTGGGCGT ACAGGGAGCG TAGGCGGTTG GGTAAGCCCT CCGTGAAATC   540
TCCGGGCCTA ACCCGGAAAG TGCGGAGGGG ACTGCTCGGC TAGAGGATGG GAGAGGAGCG   600
CGGAATTCCC GGTGTAGCGG TGAAATGCGT AGAGATCGGG AGGAAGGCCG GTGGCGAAGG   660
CGGCGCTCTG GAACATTTCT GACGCTGAGG CTCGAAAGCG TGGGGAGCAA ACAGGATTAG   720
ATACCCTGGT AGTCCACGCC TTAAACGATG GATACTAAGT GTCGGCGGGT TACCGCCGGT   780
GCCGCAGCTA ACGCATTAAG TATCCCGCCT GGGAAGTACG GCCGCAAGGT TGAAACTCAA   840
AGGAATTGAC GGGGCCCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA CGCAACGCGA   900
AGAACCTTAC CCAGGCAGGA CATGCAGGTA GTAGAAGGGT GAAAGCCTAA CGAGGTAGCA   960
ATACCATCCT GCTCAGGTGC TGCATGGCTG TCGTCAGCTC GTGCCGTGAG GTGTTGGGTT  1020
AAGTCCCGCA ACGAGCGCAA CCCCTGTCTT CAGTTACCAA CGGGTCATGC CGGGAACTCT  1080
GGAGAGACTG CCCAGGAGAA CGGGGAGGAA GGTGGGGATG ACGTCAAGTC AGCATGGCCT  1140
TTATGCCTGG GGCCACACAC GTGCTACAAT GGCCGGTACA AAGCGCTGCA AACCCGTAAG  1200
GGGGAGCCAA TCGCAAAAAA CCGGCCTCAG TTCAGATTGA GGTCTGCAAC TCGACCTCAT  1260
GAAGGCGGAA TCGCTAGTAA TCCCGGATCA GCACGCCGGG GTGAATACGT NCCCGGGCCT  1320
TGTACACACC GCCCGTCACA CCACGAAAGT TTGTTGTACC TGAAGTCGTT GGCGCCAACC  1380
GCAAGGAGGC AGACGCCCAC GGTATGACCG ATGATTGGGG  1420

(2) INFORMATION FOR SEQ ID NO: 5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1505 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Nitrospira

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

AGAGTTTGAT CCTGGCTCAG AACGAACGCT GGCGGCGCGC CTAATACATG CAAGTCGAGC    60
GAGAAGACGT AGCAATACGT TTGTAAAGCG GCGAACGGGT GAGGAATACA TGGGTAATCT   120
ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG CTTCTGAGTC   180
TTCGGGTTCG GAAGGAAAGC CGTACTGTGA GTGCGGCGCT CTTTGATGAG CTCATGTCCT   240
ATCAGCTTGT TGGTAGGGTA ACGGCCTACC AAGGCTTTGA CGGGTAGCTG GTCTGAGAGG   300
ACGATCAGCC ACACTGGCAC TGCGACACGG GCCAGACTCC TACGGGAGGC AGCAGTAAGG   360
AATATTGCGC AATGGGCGAA AGCCTGACGC AGCNACGCCG CGTGGGGGAT GAAGGTCTTC   420
GGATTGTAAA CCCCTTTCGG GAGGGAAGAT GGAGCGAGCA ATCGTTCGGA CGGTACCTCC   480
AGAAGCAGCC ACGGCCAACT TCGTGCCAGC AGCCGCGGTA ATACGAAGGT GGCAAGCGTT   540
GTTCGGATTC ACTGGGCGTA CAGGGTGTGT AGGCGGTTTG GTAAGCCTTC TGTTAAAGCT   600
TCGGGCCCAA CCCGGAAAGC GCAGACGGTA CTGCCAGGCT AGAGGGTGGG AGAGGAGCGC   660
GGAATTCCCG GTGTAGCGGT GAAATGCGTA GAGATCGGGA GGAAGGCCGG TGGCGAAGGC   720
GGCGCTCTGG AACATACCTG ACGCTGAGAC ACGAAAGCGT GGGGAGCAAA CAGGATTAGA   780
TACCCTGGTA GTCCACGCCC TAAACTATGG ATACTAAGTG TCGGCGGGTT ACCGCCGGTG   840
CCGCAGCTAA CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA   900
GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC GCAACGCGAA   960
GAACCTTACC CAGGTTGGAC ATGCACGTAG TAGAAAGGTG AAAGCCTGAC GAGGTAGCAA  1020
TACCAGCGTG CTCAGGTGCT GCATGGCTGT CGTCAGCTCG TGCCGTGAGG TGTTGGGTTA  1080
AGTCCCGCAA CGAGCGCAAC CCCTGCTTTC AGTTGCTACC GGGTCATGCC GAGCACTCTG  1140
AAAGGACTGC CCAGGATAAC GGGGAGGAAG GTGGGGATGA CGTCAAGTCA GCATGGCCTT  1200
TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA AGCGCTGCAA ACCCGTGAGG  1260
GGGAGCCAAT CGCAAAAAAC CGGCCTCAGT TCAGATTGAG GTCTGCAACT CGACCTCATG  1320
AAGGCGGAAT CGCTAGTAAT CGCGGATCAG CACGCCGCGG TGAATACGTN CCCGGGCCTT  1380
GTACACACCG CCCGTCACAC CACGAAAGCC TGTTGTACCT GAAGTCGCCC AAGCCAACCG  1440
CAAGGAGGCA GGCGCCCACG GTATGGCCCG TGATTGGGGT GAAGTCGTAA CAAGGTAACC  1500
GTAAA  1505
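
Because MOS635r is a reverse primer, its binding site appears in the listed (sense) strand as the reverse complement of the primer. The sketch below locates that site in a 60-nucleotide stretch of SEQ ID NO: 5 (clone RC25, positions 601 to 660 of the listing) with a simple ungapped sliding-window comparison. The helper names `reverse_complement` and `best_site` are illustrative, and the exhaustive scan is a stand-in for a proper alignment tool.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def best_site(template: str, probe: str) -> tuple[int, int]:
    """Return (0-based offset, mismatches) of the best ungapped match."""
    best = (-1, len(probe) + 1)
    for i in range(len(template) - len(probe) + 1):
        window = template[i:i + len(probe)]
        mm = sum(a != b for a, b in zip(window, probe))
        if mm < best[1]:
            best = (i, mm)
    return best


MOS635R = "AGCCTGGCAGTACCCTCT"  # SEQ ID NO: 16

# Positions 601-660 of SEQ ID NO: 5 (clone RC25), spaces removed.
RC25_601_660 = "TCGGGCCCAACCCGGAAAGCGCAGACGGTACTGCCAGGCTAGAGGGTGGGAGAGGAGCGC"

site = reverse_complement(MOS635R)
offset, mm = best_site(RC25_601_660, site)

print(f"site on sense strand: {site}")
print(f"best match at fragment offset {offset} with {mm} mismatch(es)")
```

The single mismatch found here agrees with the entry for RC25 in Table 5.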

(2) INFORMATION FOR SEQ ID NO: 6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1441 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Nitrospira

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AAGTCGAGCG AGAAGGTGTA GCAATACACT TGTAAAGCGG CGAACGGGTG AGGAATACAT    60
GGGTAATCTA CCATCGAGTG GGGAATAACC AGCCGAAAGG TTGGCTAATA CCGCGTACGC   120
TTCCGAGTCT TCGGGCTTGG AAGGAAAGCC GCACTGTGAG TGCGGCGCTC TTTGATGAGC   180
TCATGTCCTA TCAGCTTGTT GGTAGGGTAA CGGCCTACCA AGGCTTTGAC GGGTAGCTGG   240
TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG CCAGACTCCT ACGGGAGGCA   300
GCAGTAAGGA ATATTGCGCA ATGGGCGAAA GCCTGACGCA GCGACGCCGC GTGGGGGATG   360
AAGGTCTTCG GATTGTAAAC CCCTTTCGGG AGGGAAGATG GAGCCAGCAA TCGTTCGGAC   420
GGTACCTCCA GAAGCAGCCA CGGCCAACTT CGTGCCAGCA GCCGCGGTAA TACGAAGGTG   480
GCAAGCGTTG TTCGGATTCA CTGGGCGTAC AGGGTGTGTA NGCGGTTTGG TAAGCCTTCT   540
GTTAAAGCTT CGGGCCCAAC CCGGAAAGCG CAGAGGGTAC TGCCAGGCTA GAGGGTGGGA   600
GAGGAGCGCG GAATTCCCGG TGTAGCGGTG AAATGCGTAG AGATCGGGAG GAAGGCCGGT   660
GGCGAAGGCG GCGCTCTGGA ACATGCCTGA CGCTGAGACA CGAAAGCGTG GGGAGCAAAC   720
AGGATTAGAT ACCCTGGTAG TCCACGCCCT AAACTATGGA TACTAAGTGT CGGCGGGTTA   780
CCGCCGGTGC CGCAGCTAAC GCATTAAGTA TCCCGCCTGG GAAGTACGGC CGCAAGGTTG   840
AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCATGTGGTT TAATTCGACG   900
CAACGCGAAG AACCTTACCC AGGTTGGACA TGCACGTAGT AGAAAGGTGA AAGNCTAACG   960
AGGTAGCAAT ACCAGCGTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT GCCGTGAGGT  1020
GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGCTTTCA GTTGCTACCG GGTCATGCCG  1080
AGCACTCTGA AAGGACTGCC CAGGATAACG GGGAGGAAGG TGGGGATGAC GTCAAGTCAG  1140
CATGGCCTTT ATGCCTGGGG CCACACACGT GCTACAATGG CCGGTACAAA GCGCTGCAAA  1200
CCCGTGAGGG GGAGCCAATC GCAAAAAACC GGCCTCAGTT CAGATTGAGG TCTGCAACTC  1260
GACCTCATGA AGGCGGAATC GCTAGTAATC GCGGATCAGC ACGCCGCGGT GAATACGTNC  1320
CCGGGCCTTG TACACACCGC CCGTCACACC ACGAAAGCCT GTTGTACCTG AAGTCGCCCA  1380
AGCCAACCGC AAGGAGGCAG GCGCCCACGG TATGGCCGGT GATTGGGGTG AAGTCCTAAC  1440
A  1441

(2) INFORMATION FOR SEQ ID NO: 7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1426 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Nitrospira

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TAATACATGC AAGTCGAGCG AGAAGGTGTA GCAATACACT TGTAAAGCGG CGAACGGGTG    60
AGGAATACAT GGGTAATCTA CCATCGAGTG GGGAATAACC AACCGAAAGG TTGGCTAATA   120
CCGCGTACGC TTCTGAGCCT TCGTGTTCGG AAGGAAAGCC GTACTGTGAG TGCGGCGCTC   180
TTTGATGAGC TCATGTCCTA TCAGCTTGTT GGTAGGGTAA CGGCCTACCA AGGCTTTGAC   240
GGGTAGCTGG TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG CCAGACTCCT   300
ACGGGAGGCA GCAGTAAGGA ATATTGCGCA ATGGGCGAAA GCCTGACGCA GCNACGCCGC   360
GTGGGGGATG AAGGTCTTCG GATTGTAAAC CCCTTTCGGG AGGGAAGATG GAGCGAGCAA   420
TCGTTCGGAC GGTACCTCCA GAAGCAGCCA CGGCCAACTT CGTGCCAGCA GCCGCGGTAA   480
TACGAAGGTG GCAAGCGTTG CTTGGATTCA CTGGGCGTAC AGGGTGTGTA GGCGGTTTGG   540
TAAGCCTTCT GTTAAAGCTT CGGGCCCAAC CCGAAAAGCG CAGAGGGTAC TGCCAGGCTA   600
GAGGGTGGGA GAGGAGCGCG GAATTCCCGG TGTAGCGGTG AAATGCGTAG AGATCGGGAG   660
GAAGGCCGGT GGCGAAGGCG GCGCTCTGGA ACATACCTGA CGCTGAGACA CGAAAACGTG   720
GGGAGCAAAC AGGATTAGAT ACCCTGGTAG TCCACGCCCT AAACTATGGA TACTAAGTGT   780
CGGCGGGTTA CCGCCGGTGC CGCAGCTAAC GCATTAAGTA TCCCGCCTGG GAGGTACGGC   840
CGCAAGGTTG AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCTTGTGGTT   900
TAATTCGACG CAACGCGAAG AACCTTACCC AGGTTGGACA TGCACGTAGT AGAAAGGTGA   960
AAGCCTGACG AGGTAGCAAT ACCAGCGTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT  1020
GCCGTGAGGT GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGCTTTCA GTTGCTACCG  1080
GGTCATGCCG AGCACTCTGA AAGGACTGCC CAGGATAACG GGGAGGAAGG TGGGGATGAC  1140
GTCAAGTCAG CATGGCCTTT ATGCCTGGGG CCACACACGT GCTACAATGG CCGGTACAAA  1200
GCGCTGCAAA CCCGTGAGGG GGAGCCAATC GCAAAAAACC GGCCTCAGTT CAGATTGAGG  1260
TCTGCAACTC GACCTCATGA AGGCGGAATC GCTAGTAATC GCGGATCAGC ACGCCGCGGT  1320
GAATACGTNC CCGGGCCTTG TACACACCGC CCGTCACACC ACGAAAGCCT GTTGTACCTG  1380
AAGTCGCCCA AGCCAACCGC AAGGAGGCAG GCGCCCACGG TATGGC  1426

(2) INFORMATION FOR SEQ ID NO: 8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1429 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Nitrospira

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TAATACATGC AAGTCGAGCG AGAAGGTGTA GCAATACACT TGTAAAGCGG CGAACGGGTG    60
AGGAATACAT GGGTAATCTA CCATCGAGTG GGGAATAACC AACCGAAAGG TTGGCTAATA   120
CCGCGTACGC CTCCGAGTCT TCGGGTTCGG AGGGAAAGCT GCACTGTGAG TGTAGCGCTC   180
TTTGATGAGC TCATGTCCTA TCAGCTTGTT GGTAGGGTAA CGGCCTACCA AGGCTTTGAC   240
GGGTAGCTGG TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG CCAGACTCCT   300
ACGGGAGGCA GCAGTAAGGA ATATTGCGCA ATGGGCGAAA GCCTGACGCA GCNACGCCGC   360
GTGGGGGATG AAGGTCTTCG GATTGTAAAC CCCTTTCGGG AGGGAAGATG GAGCGAGCAA   420
TCGTTCGGAC GGTACCTCCA GAAGCAGCCA CGGCCAACTT CGTGCCAGCA GCCGCGGTAA   480
TACGAAGGTG GCAAGCGTTG TTCGGATTCA CTGGGCGTAC AGGGTGTGTA GGCGGTTTGG   540
TAAGCCTTCT GTTAAAGCTT CGGGCCCAAC CCGGAAAGCG CAGGGGGTAC TGCCAGGCTA   600
GAGGGTGGGA GAGGAGCGCG GAATTCCCGG TGTAGCGGTG AAATGCGTAG AGATCGGGAG   660
GAAGGCCGGT GGCGAAGGCG GCGCTCTGGA ACATACCTGA CGCTGAGACA CGAAAGCGTG   720
GGGAGCAAAC AGGATTAGAT ACCCTGGTAG TCCACGCCCT AAGCTATGGA TACTAAGTGT   780
CGGCGGGTTA CCGCCGGTGC CGCAGCCAAC GCGTTAAGTA TCCCGCCTGG GAAGTACGGC   840
CGCAAGGTTG AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCATGTGGTT   900
TAATTCGACG CAACGCGAAG AACCTTACCC AGGTTGGACA TGCACGTAGT AGAAAGGTGA   960
AAGCCTGACG AGGTAGCAAT ACCAGCGTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT  1020
GCCGTGAGGT GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGCTTTCA GTTGCTACCG  1080
GGTCATGCCG AGCACTCTGA AAGGACTGCC CAGGATAACG GGGGAGGAAG GTGGGGATGA  1140
CGTCAAGTCA GCATGGCCTT TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA  1200
AACGCTGCAA ACCCGTGAGG GGGAGCCAAT CGCAAAAAAC CGGCCTCAGT TCAGATTGAG  1260
GTCTGCAACT CGACCTCATG AAGGCGGAAT CGCTAGTAAT CGCGGATCAG CACGCCGCGG  1320
TGAATACGTN CCCGGGCCTT GTGCACACCG CCCGTCACAC CACGAAAGCC TGTTGTACCT  1380
GAAGTCGCCC AAGCCAACCG CAAGGAGGCA GGCGCCCACG GTATGGCCG  1429

(2) INFORMATION FOR SEQ ID NO : 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1415 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI -SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Nitrospira 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CGAGAAGGTG TAGCAATACA CTTGTAAAGC GGCGAACGGG TGAGGAATAC ATGGGTAATC 
TACCATCGAG TGGGGAATAA CCAACCGAAA GGTTGGCTAA TACCGCGTAC GCCTCCGAGT 
CTTCGGGTTC GGAGGGAAAG CTGCACTGTG AGTGTAGCGC TCTTTGATGA GCTCATGTCC 
TATCAGCTTG TTGGTAGGGT AACGGCCTAC CAAGGCTTTG ACGGGTAGCT GGTCTGAGAG 
GACGATCAGC CACACTGGCA CTGCGACACG GGCCAGACTC CTACGGGAGG CAGCAGTAAG
GAATATTGCG CAATGGGCGA AAGCCTGACG CAGCNACGCC GCGTGGGGGA TGAAGGTCTT 
CGGATTGTAA ACCCCTTTCG GGAGGGAAGA TGGAGCGAGC AATCGTTCGG ACGGTACCTC 
CAGAAGCAGC CACGGCCAAC TTCGTGCCAG CAGCCGCGGT AATACGAAGG TGGCAAGCGT 
TGTTCGGATT CACTGGGCGT ACAGGGTGTG TAGGCGGTTT GGTAAGCCTT CTGTTAAAGC 
TTCGGGCCCA ACCCGGAAAG CGCAGAGGGT ACTGCCAGGC TAGAGGGTGG GAGAGGAGCG 
CGGAATTCCC GGTGTAGCGG TGAAATGCGT AGAGATCGGG AGGAAGGCCG GTGGCGAAGG

CGGCGCTCTG GAACATACCT GACGCTGAGA CACGAAAGCG TGGGGAGCAA ACAGGATTAG 

ATACCCTGGT AGTCCACGCC CTAAACTATG GATACTAAGT GTCGGCGGGT TACCGCCGGT 

GCCGCAGCTA ACGCATTAAG TATCCCGCCT GGGAAGTACG GCCGCAAGGT TGAAACTCAA 

AGGAATTGAC GGGGGCCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA CGCAACGCGA 

AGAACCTTAC CCAGGTTGGA CATGCACGTA GTAGAAAGGT GAAAGCCTGA CGAGGTAGCA 

ATACCAGCGT GCTCAGGTGC TGCATGGCTG TCGTCAGCTC GTGCCGTGAG GTGTTGGGTT 

AAGTCCCGCA ACGAGCGCAA CCCCTGCTTT CAGTTGCTAC CGGGTCATGC CGAGCACTCT

GAAAGGACTG CCCAGGATAA CGGGGAGGAA GGTGGGGATG ACGTCAAGTC AGCATGGCCT 

TTATGCCTGG GGCCACACAC GTGCTACAAT GGCCGGTATA AAACGCTGCA AACCCGTGAG

GGGGAGCCAA TCGCAAAAAA CCGGCCTCAG TTCAGATTGA GGTCTGCAAC TCGACCTCAT 

GAAGGCGGAA TCGCTAGTAA TCGCGGATCA GCACGCCGCG GTGAATACGT NCCCGGGCCT 

TGTACACACC GCCCGTCACA CCACGAAAGC CTGTTGTACC TGAAGTCGCC CAAGCCAACC 

GCAAGGAGGC AGGCGCCCAC GGTATGGCCG GTGAT 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1435 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Nitrospira 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CCTAATACAT GCAAGTCGAT CGAGAAGGTG TAGCAATACA CTTGTAAAGC GGCGAACGGG 60
TGAGGAATAC ATGGGTAATC TACCATCGAG TGGGGAATAA CCAACCGAAA GGTTGGCTAA 120
TACCGCGTAC GCCTCCGAGT CTTCGGGTTC GGAGGGAAAG CTGCACTGTG AGTGTAGCGC 180
TCTTTGATGA GCTCATGTCC TATCAGCTTG TTGGTAGGGT AACGGCCTAC CAAGGCTTTG 240
ACGGGTAGCT GGTCTGAGAG GACGATCAGC CACACTGGCA CTGCGACACG GGCCAGACTC 300
CTACGGGAGG CAGCAGTAAG GAATATTGCG CAATGGGCGA AAGCCTGACG CAGCCACGCC 360
GCGTGGGGGA TGAAGGTCTT CGGATTGTAA ACCCCTTTCG GGAGGGAAGA TGGAGCGAGC 420
AATCGTTCGG ACGGTACCTC CAGAAGCAGC CACGGCCAAC TTCGTGCCAG CAGCCGCGGT 480
AATACGAAGG TGGCAAGCGT TGTTCGGATT CACTGGGCGT ACAGGGTGTG TAGGCGGTTT 540
GGTAAGCCTT CTGTTAAAGC TTCGGGCCCA ACCCGGAAAG CGCAGAGGGT ACTGCCAGGC 600
TAGAGGGTGG GAGAGGAGCG CGGAATTCCC GGTGTAGCGG TGAAATGCGT AGAGATCGGG 660
AGGAAGGCCG GTGGCGAAGG CGGCGCTCTG GAACATACCT GACGCTGAGA CACGAAAGCG 720
TGGGGAGCAA ACAGGATTAG ATACCCTGGT AGTCCACGCC CTAAACTATG GATACTAAGT 780
GTCGGCGGGT TACCGCCGGT GCCGCAGCTA ACGCATTAAG TATCCCGCCT GGGAAGTACG 840
GCCGCAAGGT TGAAACTCAA AGGAATTGAC GGGGGCCCGC ACAAGCGGTG GAGCATGTGG 900
TTTAATTCGA CGCAACGCGA AGAACCTTAC CCAGGTTGGA CATGCACGTA GTAGAAAGGT 960
GAAAGCCTGA CGAGGTAGCA ATACCAGCGT GCTCAGGTGC TGCATGGCTG TCGTCAGCTC 1020
GTGCCGTGAG GTGTTGGGTT AAGTCCCGCA ACGAGCGCAA CCCCTGCTTT CAGTTGCTAC 1080
CGGGTCATGC CGAGCACTCT GAAAGGACTG CCCAGGATAA CGGGGAAGGA AGGTGGGGAT 1140
GACGTCAAGT CAGCATGGCC TTTATGCCTG GGGCCACACA CGTGCTACAA TGGCCGGTAC 1200
AAAACGCTGC AAACCCGTGA GGGGGAGCCA ATCGCAAAAA ACCGGCCTCA GTTCAGATTG 1260
AGGTCTGCAA CTCGACCTCA TGAAGGCGGA ATCGCTAGTA ATCGCGGATC AGCACGCCGC 1320
GGTGAATACG TNCCCGGGCC TTGTACACAC CGCCCGTCAC ACCACGAAAG CCTGTTGTAC 1380
CTGAAGTCGC CCAAGCCAAC CGCAAGAAGG CAGGCGCCCA CGGTATGGCC GGTGA 1435



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1437 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

AATACATGCA AGTCGATCGA GAAGGTGTAG CAATACACTT GTAAAGCGGC GAACGGGTGA 60
GGAATACATG GGTAATCTAC CATCGAGTGG GGAATAACCA ACCGAAAGGT TGGCTAATAC 120
CGCGTACGCC TCCGAGTCTT CGGGTTCGGA GGGAAAGCTG CACTGTGAGT GTAGCGCTCT 180
TTGATGAGCT CATGTCCTAT CAGCTTGTTG GTAGGGTAAC GGCCTACCAA GGCTTTGACG 240
GGTAGCTGGT CTGAGAGGAC GATCAGCCAC ACTGGCACTG CGACACGGGC CAGACTCCTA 300
CGGGAGGCAG CAGTAAGGAA TATTGCGCAA TGGGCGAAAG CCTGACGCAG CCACGCCGCG 360
TGGGGGATGA AGGTCTTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG AGCGAGCAAT 420
CGTTCGGACG GTACCTCCAG AAGCAGCCAC GGCCAACTTC GTGCCAGCAG CCGCGGTAAT 480
ACGAAGGTGG CAAGCGTTGT TCGGATTCAC TGGGCGTACA GGGTGTGTAG GCGGTTTGGT 540
AAGCCTTCTG TTAAAGCTTC GGGCCCAACC CGGAAAGCGC AGAGGGTACT GCCAGGCTAG 600
AGGGTGGGAG AGGAGCGCGG AATTCCCGGT GTAGCGGTGA AATGCGTAGA GATCGGGAGG 660
AAGGCCGGTG GCGAAGGCGG CGCTCTGGAA CATACCTGAC GCTGAGACAC GAAAGCGTGG 720
GGAGCAAACA GGATTAGATA CCCTGGTAGT CCACGCCCTA AACTATGGAT ACTAAGTGTC 780
GGCGGGTTAC CGCCGGTGCC GCAGCTAACG CATTAAGTAT CCCGCCTGGG AAGTACGGCC 840
GCAAGGTTGA AACTCAAAGG AATTGACGGG GGCCCGCACA AGCGGTGGAG CATGTGGTTT 900
AATTCGACGC AACGCGAAGA ACCTTACCCA GGTTGGACAT GCACGTAGTA NAAAGGTGAA 960
AGCCTGACGA GGTAGCAATA CCAGCGTGCT CAGGTGCTGC ATGGCTGTCT TCAGCTCGTG 1020
CCGTGAGGTG TTGGGTTAAG TCCCGCAACG AGCGCAACCC CTGCTTTCAG TTGCTACCGG 1080
GTCATGCCGA ACACTCTGAA AGGACTGCCC AGGATAACGG GGAAGGAAGG TGGGGATGAC 1140
GTCAAGTCAG CATGGCCTTT ATGCCTGGGG CCACACACGT GCTACAATGG CCGGTACAAA 1200
GCGCTGCAAA CCCGTGAGGG GGAGCCAATC GCAAAAAACC GGCCTCAGTT CAGATTGAGG 1260
TCTGCAACTC GACCTCATGA AGGCGGAATC GCTAGTAATC GCGGATCAGC ACGCCGCGGT 1320
GAATACGTNC CCGGGCCTTG TACACACCGC CCGTCACACC ACGAAAGCCT GTTGTACCTG 1380
AAGTCGCCCA AGCCAACCGC AAGGAGGCAG GCGCCCACGG TATGGCCGGT GATGGGG 1437
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1437 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

AATACATGCA AGTCGATCGA NAAGGTGTAG CAATACACTT GTAAAGCGGC GAACGGGTGA 60
GGAATACATG GGTAATCTAC CATCGAGTGG GGAATAACCA ACCGAAAGGT TGGCTAATAC 120
CGCGTACGCC TCCGAGTCTT CGGGTTCGGA GGGAAAGCTG CACTGTGAGT GTAGCGCTCT 180
TTGATGAGCT CATGTCCTAT CAGCTTGTTG GTAGGGTAAC GGCCTACCAA GGCTTTGACG 240
GGTATCTGGT CTGAGAGGAC GATCAGCCAC ACTGGCACTG CGACACGGGC CAGACTCCTA 300
CGGGAGGCAG CAGTAAGGAA TATTGCGCAA TGGGCGAAAC CCNGACGCAG CCACGCCGCG 360
TGGGGGATGA AGGTCTTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG AACGAGCAAT 420
CGTTCGGACG GTACCTCCAG AAGCAGCCAC GGCCAACTTC GTGCCAGCAG CCGCGGTAAT 480
ACGAAGGTGG CAAGCGTTGT TCGGATTCAC TGGGCGTACA GGGTGTGTAG GCGGTTTGGT 540
AAGCCTTCTG TTAAAGCTTC GGGCCCAACC CGGAAAGCGC AGAGGGTACT GCCAGGCTAG 600
AGGGTGGGAG AGGAGCGCGG AATTCCCGGT GTAGCGGTGA AATGCGTAGA GATCGGGAGG 660
AAGGCCGGTG GCGAAGGCGG CGCTCTGGAA CATACCTGAC GCTGAGACAC GAAAGCGTGG 720
GGNGCAAACA GGATTAGATA CCCTGGTAGT CCACGCCCTA AACTATGGAT ACTAAGTGTC 780
GGCGGGTTAC CGCCGGTGCC GCAGCTAACG CATTAAGTAT CCCGCCTGGG AAGTACGGCC 840
GCAAGGTTGA AACTCAAAGG GATTGACGGG GGCCCGCACA AGCGGTGGGG CATGTGGTTT 900
AATTCGACGC AACGCGAAGA ACCTTACCCA GGTTGGACAT GCACGTAGTN GAAAGGTGAA 960
AGCCTGACGA GGTAGCAATA CCAGCGTGCT CAGGTGCTGC ATGGCTGTCG TCAGCTCGTG 1020
CCGTGAGGTG TTGGGTTAAG TCCCGCAACG AGCGCAACCC CTGCTTTCAG TTGCTACCGG 1080
GTCATGCCGA ACACTCTGAA AGGACTGCCC AGGATAACGG GGAAGGAAGG TGGGGATGAC 1140
GTCAAGTCAG CATGGCCTTT ATACCTGGGG CCACACACGT GCTACAATGG CCGGTACAAA 1200
ACGCTGCAAA CCCGTGAGGG GGAGCGAATC GCAAAAAACC GGCCTCAGTT CAGATTGAGG 1260
TCTGCAACTC GACCTCATGA ATGCGGAATC GCTAGTAATC GCGGATCAGC ACGCCGCGGT 1320
GAATACGTNC CCGGGCCTTG TACACACCGC CCGTCACACC ACGAAAGCCT GTTGTACCTG 1380
AAGTCGCCCA AGCCAACCGC AAGGAGGCAG GCGCCCACGG TATGGCCGGT GATGGGG 1437
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1435 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

TAATACATGC AAGTCGATCG ANAAGGTGTA GCAATACACT TGTAAAGCGG CGAACGGGTG 60
AGGAATACAT GGGTAATCTA CCATCGAGTG GGGAATAACC AACCGAAAGG TTGGCTAATA 120
CCGCGTACGC TTCCGAGTCT TCGGGCTTGG AAGGAAAGCC GCACTGTGAG TGCGGCGCTC 180
TTTGATGAGC TCATATCCTA TCANCTTGTT GGTAGGGTAA CGGCCTACCA AGGCTTTGAC 240
GGGTATCTGG TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG CCAGACTCCT 300
ACGGGAGGCA GCAGTAAGGA ATATTGCGCA ATGGGCGAAA CCCNGACGCA GCCACGCCGC 360
GTGGGGGATG AAGGTCTTCG GATTGTAAAC CCCTTTCGGG AGGGAAGATG GAACGAGCAA 420
TCGTTCGGAC GGTACCTCCA GAAGCAGCCA CGGCCAACTT CGTGCCAGCA GCCGCGGTAA 480
TACGAAGGTG GCAAGCGTTG TTCGGATTCA CTGGGCGTAC AGGGTGTGTA GGCGGTTTGG 540
TAAGCCTTCT GTTAAAGCTT CGGGCCCAAC CCGGAAAGCG CAGAGGGTAC TGCCAGGCTA 600
GAGGGTGGGA GAGGAGCGCG GAATTCCCGG TGTAGCGGTG AAATGCGTAG AGATCGGGAG 660
GAAGGCCGGT GGCGAAGGCG GCGCTCTGGA ACATACCTGA CGCTCAGACA CGAAAGCGTG 720
GGGAGCAAAC AGGATTAGAT ACCCTGGTAG TCCACGCCCT AAACTATGGA TACTAAGTGT 780
CGGCGGGTTA CCGCCGGTGC CGCAGCTAAC GCATTAAGTA TCCCGCCTGG GAAGTACGGC 840
CGCAAGGTTG AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCATGTGGTT 900
TAATTCGACG CAACGCGAAG AACCTTACCC AGGTTGGACA TGCACGTAGT AGAAAGGTGA 960
AAGCCTGACG AGGTAGCAAT ACCAGCGTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT 1020
GCCGTGAGGT GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGCTTTCA GTTGCTGCCG 1080
GGTCATGCCG AACACTCTGA AAGGACTGCC CAGGATAACG GGGAAGGAAG GTGGGGATGA 1140
CGTCAAGTCA GCATGGCCTT TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA 1200
AACGCTGCAA ACCCGTGAGG GGGAGCCAAT CGCAAAAAAC CGGCCTCAGT TCANATTGAG 1260
GTCTGCAACT CGACCTCATG AATGCGGAAT CGCTAGTAAT CGCGGATCAG CACGCCGCGG 1320
TGAATACGTN CCCGGGCCTT GTACACGCCG CCCGTCACAC CACGAAAGCC TGTTGTACCT 1380
GAAGTCGCCC AAGCCAACCG CAAGGAGGCA NGCGCCCACG GTATGGCCGG TGATG 1435
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "oligonucleotide primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CGGGAGGGAA GATGGAGC

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Oligonucleotide primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CCAACCCGGA AAGCGCAGAG

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Oligonucleotide primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

AGCCTGGCAG TACCCTCT

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrococcus mobilis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CAGCCGGGAG GAAAAGCA

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Magnetobacterium bavaricum

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TGTAGGGAAA GATGATGA 18
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrobacter hamburgensis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TGTGCGGGAA GATAATGA 18
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospina gracilis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CGGGTGGGAA GAACAAAA

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira marina

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CATGAGGAAA GATAAAGT

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CGGCAGGGAA GATGGAAC

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

CGGGAGGGAA GATGGAGC 18
(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

CCGCAGGGAA GATGGAAC

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

CGGGAGGGAA GATGGAAC

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrobacter

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

CGTGCGGGAA GATAATGA

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

CGGCAGGGAA GATGGAAC

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira moscoviensis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

CGGGAGGGAA GATGGACG 18
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrococcus mobilis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

TCAACCTGGG AATTGCATCC 20

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Magnetobacterium bavaricum

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TCAACCCGGG AATTGCCTTG
(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrobacter hamburgensis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

TCAACTCCAG AACTGCCTTT

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospina gracilis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

TCAACCGTGG AATTGCGTTT
(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospina marina

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

TTAACCGGGA AAGGTCGAGA 20

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

CTAACCCGGA AAGTGCGGAG
(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

CCAACCCGAA AAGCGCAGAG 20

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

CCAACCCGGA AAGCGCAGAG 20

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrobacter

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

TCAACTCCAG AACTGCCTTT 20

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira moscoviensis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

CCAACCCGGA AAGCGCAGAG

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrococcus mobilis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

AGCCAAACAG TATCGGAT
(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Magnetobacterium bavaricum

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

AGTTAAACAG TTTTCAAG

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrobacter hamburgensis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

AGACCTTCAG TATCAAAG

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospina gracilis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

AGCCGAATAG TTTCAAAC

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospina marina

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

AGCTGAATAG TTCCTCTC 18
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

AGCCGAGCAG TCCCCTCC 18
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

AGCCTGGCAG TACCCTCT

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

AGCCTGGCAG TACCCCCT
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

AGCCTGGCAG TACCGTCT
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrobacter

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

AGATCCTCAG TATCAAAG

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira moscoviensis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

AGCCTGGCAG TACCCTCT

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Oligonucleotide primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

CCTGTGCTCC ATGCTCCG

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrobacter hamburgensis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

CCTGTGCTCC ATGCTCCG
(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospina gracilis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

CCTGTGCAAG GGCCCCGA
(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrococcus mobilis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

CCTGTCATCC GGTTCCCG

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira moscoviensis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCTGAGCACG CTGGTATT

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospina marina

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

CCTGAGCTCG CTCCCCTT

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Magnetobacterium bavaricum

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

CCTGTGCAAG CTCTCCCT

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

CCTGAGCAGG ATGGTATT
(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

CCTGAGCACG CTGGTATT 18
(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

CCTGAGCAGG ATGGTGTT 18
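
The relationship between the primer sequences and the 16S rDNA sequences in the listing above can be checked computationally. The following Python sketch is illustrative only and forms no part of the disclosed methods. It takes positions 361 to 600 of SEQ ID NO: 11, looks for SEQ ID NO: 14 on the listed strand, looks for the reverse complement of SEQ ID NO: 16 downstream of it, and reports the fragment the two sequences span. The helper names `reverse_complement` and `amplicon_span` are hypothetical. Exact string matching is an assumption made for simplicity, since real primer annealing tolerates some mismatch.

```python
from typing import Optional, Tuple

# Illustrative sketch only; helper names are hypothetical and not from the specification.

# Positions 361-600 of SEQ ID NO: 11, copied from the sequence listing.
SEQ11_361_600 = "".join("""
TGGGGGATGA AGGTCTTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG AGCGAGCAAT
CGTTCGGACG GTACCTCCAG AAGCAGCCAC GGCCAACTTC GTGCCAGCAG CCGCGGTAAT
ACGAAGGTGG CAAGCGTTGT TCGGATTCAC TGGGCGTACA GGGTGTGTAG GCGGTTTGGT
AAGCCTTCTG TTAAAGCTTC GGGCCCAACC CGGAAAGCGC AGAGGGTACT GCCAGGCTAG
""".split())

FORWARD = "CGGGAGGGAAGATGGAGC"  # SEQ ID NO: 14
REVERSE = "AGCCTGGCAGTACCCTCT"  # SEQ ID NO: 16

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (N is kept as N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_span(template: str, forward: str, reverse: str) -> Optional[Tuple[int, int]]:
    """Return the 0-based, half-open (start, end) span bounded by the two primers.

    The forward primer must match the template exactly, and the reverse
    complement of the reverse primer must match exactly downstream of it.
    Returns None when either site is absent.
    """
    start = template.find(forward)
    if start < 0:
        return None
    site = reverse_complement(reverse)
    hit = template.find(site, start + len(forward))
    if hit < 0:
        return None
    return start, hit + len(site)


if __name__ == "__main__":
    span = amplicon_span(SEQ11_361_600, FORWARD, REVERSE)
    if span is None:
        print("primer sites not found")
    else:
        start, end = span
        offset = 361  # listing position of the first base in the excerpt
        print(f"fragment spans positions {start + offset} to {end + offset - 1}")
        print(f"fragment length: {end - start} bp")
```

For this excerpt the sketch finds both sites and reports a fragment of 203 bp, spanning positions 396 to 598 of SEQ ID NO: 11.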
CLAIMS

1. A consortium of microorganisms capable of nitrite oxidation in wastewater, which consortium
is enriched in members of the Nitrospira phylum.

2. An oligonucleotide primer for PCR amplification of Nitrospira DNA, said primer comprising
at least 12 nucleotides having a sequence selected from the group consisting of:

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; and 

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ 
ID NO: 13. 

3. The oligonucleotide primer of claim 2, wherein said primer has a length of 12 to 50
nucleotides.

4. The oligonucleotide primer of claim 2, wherein said primer has a length of 12 to 22 
nucleotides. 

5. The oligonucleotide primer of claim 2, wherein said primer sequence is selected from the 
group consisting of SEQ ID NO: 14, SEQ ID NO: 15 and SEQ ID NO: 16. 

6. A primer pair for PCR amplification of Nitrospira DNA, said primer pair comprising:

(a) a first oligonucleotide of at least 12 nucleotides having a sequence selected from one 
strand of a bacterial 16S rDNA gene; and 

(b) a second oligonucleotide of at least 12 nucleotides having a sequence selected from the
other strand of said 16S rDNA gene downstream of said first oligonucleotide sequence; wherein at
least one of said first and second oligonucleotides is selected from the group consisting of:

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; and 

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ 
ID NO: 13. 

7. The primer pair of claim 6, wherein said first and second oligonucleotide primers
independently have lengths of 12 to 50 nucleotides.

8. The primer pair of claim 6, wherein said first and second oligonucleotide primers 
independently have lengths of 12 to 22 nucleotides. 

9. The primer pair of claim 6, wherein said first oligonucleotide primer sequence is selected
from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15, and said second oligonucleotide
primer sequence is SEQ ID NO: 16.

10. A probe for detecting Nitrospira DNA, said probe comprising at least 12 nucleotides having a 
sequence selected from the group consisting of: 

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; and 

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ
ID NO: 13.

11. The probe of claim 10, wherein said probe has a length of 15 to 50 nucleotides.

12. The probe of claim 10, wherein said probe has a length of 15 to 22 nucleotides. 

13. A kit comprising: 

at least one primer according to claim 2; 

at least one primer pair according to claim 6; or 

at least one probe according to claim 10. 

14. The kit of claim 13, wherein said kit further includes reagents selected from the group 
consisting of buffers, salts, detergents, nucleotides and thermostable polymerase. 

15. A method of detecting a Nitrospira species in a sample, said method comprising the steps of: 

(a) lysing cells in said sample to release genomic DNA; 

(b) contacting denatured genomic DNA from step (a) with a primer pair according to claim 6; 

(c) amplifying Nitrospira DNA by cyclically reacting said primer pair with said DNA to 
produce an amplification product; and 

(d) detecting said amplification product. 

16. The method according to claim 15, wherein said amplification product has a length of 50 to 
1,400 bps. 
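
The relationship between a primer pair and the length of its amplification product can be sketched as a simple in silico check. This is an illustration under stated assumptions, not the method of the specification: primers must match the template exactly, the reverse primer is written 5' to 3' on the opposite strand, and the template and primer sequences are hypothetical.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str) -> Optional[int]:
    """Length of the product bounded by both primers, or None if either primer has no exact site."""
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    # The reverse primer anneals to the top strand, so its site reads as its reverse complement.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return end + len(site) - start


def within_claimed_range(length: Optional[int], shortest: int = 50, longest: int = 1400) -> bool:
    """True when a product was found and its length lies within 50 to 1,400 bp."""
    return length is not None and shortest <= length <= longest


# Hypothetical primers and template, for illustration only.
forward = "GGATCCGATTAC"
reverse = "CCGTAAGCTTGA"
template = "TT" + forward + "A" * 40 + reverse_complement(reverse) + "GG"

length = amplicon_length(template, forward, reverse)
print(length, within_claimed_range(length))
```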

17. A method of quantitating the level of a Nitrospira species in a sample, said method comprising 
the steps of: 

(a) lysing cells in said sample to release genomic DNA; 

(b) contacting denatured genomic DNA from step (a) with a primer pair according to claim 6; 

(c) amplifying Nitrospira DNA by cyclically reacting said primer pair with said DNA to 
produce an amplification product; and 

(d) detecting said amplification product and quantitating the level of said product by 
comparison with at least one reference standard. 

18. The method according to claim 17, wherein said amplification product has a length of 50 to 
1,400 bps. 
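
Quantitation "by comparison with at least one reference standard" can be read in several ways. One possible reading, sketched below, is a standard curve: a dilution series of known template amounts is measured, a straight line is fitted to signal against log10(copies), and the sample is interpolated on that line. The threshold-cycle style signal, the function names and all numbers are assumptions for illustration, and this sketch needs at least two distinct standards.

```python
import math
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_copies(signal: float, standards: List[Tuple[float, float]]) -> float:
    """Interpolate a copy number from (known copies, measured signal) reference standards."""
    xs = [math.log10(copies) for copies, _ in standards]
    ys = [measured for _, measured in standards]
    slope, intercept = fit_line(xs, ys)
    return 10 ** ((signal - intercept) / slope)


# Hypothetical dilution series: (template copies, measured signal).
standards = [(1e3, 30.0), (1e4, 26.7), (1e5, 23.4), (1e6, 20.1)]

print(estimate_copies(25.05, standards))
```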

19. A method of detecting a Nitrospira species in a sample, said method comprising the steps of: 

(a) lysing cells in said sample to release genomic DNA; 

(b) contacting denatured genomic DNA from step (a) with a labelled probe according to 
claim 4 under conditions which allow hybridisation of said genomic DNA with said probe; 

(c) separating hybridised labelled probe and genomic DNA from unhybridised labelled 
probe; and 

(d) detecting said labelled probe-genomic DNA hybrid. 

20. A method of detecting cells of a Nitrospira species in a sample, said method comprising the 
steps of: 

(a) treating cells in said sample to fix cellular contents; 

(b) contacting said fixed cells from step (a) with a labelled probe according to claim 10 
under conditions which allow said probe to hybridise with RNA within said fixed cell; 

(c) removing unhybridised probe from said fixed cells; and 

(d) detecting said labelled probe-RNA hybrid. 
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t : 

SBR1024- 
SBR1015- 

GC86 
SBR2046- 

RC2 5 

RC19 - 
SBR2016- 

RC7 

RC14 

RC9 9 - 

RC11 

RC7 3 - 

RC90 



50 ] 



... - - TCGACCTG CAGGCGGCCG C ACT AG TG AT 

-GC TCTCCCATAT GGTCGACCTG CAGGCGGCCG CACTAGTGAT 



[ 51 
SBR1024-- 



100 ] 



- T AATACAT 



SBR1015- 

GC8 6 TAGAGTTTGA TCCTGGCTCA GAACGAACGC TGGCGGCGCG CCTAATACAT 
SBR2046 

RC2 5 

RC19 
SBR2 016 

RC7 

RC14 



TAGAGTTTGA TCCTGGCTCA GAACGAACGC TGGCGGCGCG CCTAATACAT 



- TAATACAT 
-T AATACAT 



RC9 9 
RC11 
RC73 
RC90 



[ 101 
SBR1 0 2 4 - CAAGTCGAG 
SBR1 0 1 5 GCAAGTCGAG 

GC 8 6 GCAAGTCGAG 
SBR2046 

RC2 5 GCAAGTCGAG 

RC1 9 - -AAGTCGAG 
SBR2 0 1 6 GCAAGTCGAG 

RC7 GCAAGTCGAG 

RC14 

RC99 

RC11 

RC73 

RC90 



GCAAGTCGAT 
GCAAGTCGAT 
GCAAGTCGAT 
GCAAGTCGAT 



CGAGAAGACG TA GCAA . 

CGAGAAGACG TA GCAA. 

CGAGAAGACG TA GCAA. 

CGAGAAGACG TA GCAA. 

CGAGAAGACG TA GCAA. 

CGAGAAGGTG TA GCAA. 

CGAGAAGGTG TA GCAA. 

CGAGAAGGTG TA GCAA. 

CGAGAAGGTG TA GCAA. 

CGAGAAGGTG TA GCAA. 

CGAGAAGGTG TA GCAA. 

CGANAAGGTG TA GCAA. 

CGANAAGGTG TA GCAA. 



CCTAATACAT 

AATACAT 

AATACAT 

- -TAATACAT 
[ 151 200 ]
SBR1024 CGTTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAACCT
SBR1015 CGTTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAGCCT
GC86    CGTTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAACCT
SBR2046 CGTTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAACCT
RC25    CGTTTGTAAA GCGGC.... GAACGGGT GAGGAATACA TGGGTAATCT
RC19    CACTTGTAAA [remainder of row illegible in source]
SBR2016 CACTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAATCT
RC7     CACTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAATCT
RC14    CACTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAATCT
RC99    CACTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAATCT
RC11    CACTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAATCT
RC73    CACTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAATCT
RC90    CACTTGTAAA GCGGC.... ...GAACGGGT GAGGAATACA TGGGTAATCT


[ 201 250 ]
SBR1024 ACCTTCGAGT GGGGAATAAC TAGCCGAAAG GTTAGCTAAT ACCGCATACG
SBR1015 ACCCTCGAGT GGGGAATAAC TAACCGAAAG GTTAGCTAAT ACCGCATACG
GC86    ACCCTCGAGT GGGGAATAAC TAGCCGAAAG GTTAGCTAAT ACCGCATACG
SBR2046 ACCCTCGAGT GGGGAATAAC TAACCGAAAG GTTAGCTAAT ACCGCATACG
RC25    ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC19    ACCATCGAGT GGGGAATAAC CAGCCGAAAG GTTGGCTAAT ACCGCGTACG
SBR2016 ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC7     ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC14    ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC99    ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC11    ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC73    ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC90    ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG

[ 251 300 ]
SBR1024 ACTCCTGGTC .TGC. .GGAT CGGGAGAGAA AGCGATACC GTG.
SBR1015 GCTCCTGGTC .TGC. .GGAT CGGGAGAGAA AGCGATACC GTG.
GC86    ACTCCTGGTC .TGC. .GGAT CGGGAGAGAA AGCGATACC GTG.
SBR2046 GCTCCTGGTC .TGC. .GGAT CGGGAGAGAA AGCGATACC GTG.
RC25    CTTCTGAGTC .TTC. .GGGT TCGGAAGGAA AGCCGTACT GTG.
RC19    CTTCCGAGTC .TTC. .GGGC TTGGAAGGAA AGCCGCACT GTG.
SBR2016 CTTCTGAGCC .TTC. .GTGT TCGGAAGGAA AGCCGTACT GTG.
RC7     CCTCCGAGTC .TTC. .GGGT TCGGAGGGAA AGCTGCACT GTG.
RC14    CCTCCGAGTC .TTC. .GGGT TCGGAGGGAA AGCTGCACT GTG.
RC99    CCTCCGAGTC .TTC. .GGGT TCGGAGGGAA AGCTGCACT GTG.
RC11    CCTCCGAGTC .TTC. .GGGT TCGGAGGGAA AGCTGCACT GTG.
RC73    CCTCCGAGTC .TTC. .GGGT TCGGAGGGAA AGCTGCACT GTG.
RC90    CTTCCGAGTC .TTC. .GGGC TTGGAAGGAA AGCCGCACT GTG.
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[ 301 350 

SBR1024 GGTAT CGCGCTCTTG GATGGGCTCA TGTCCTATCA GCTTGTTGGT
SBR1015 GGTAT CGCGCTCTTG GATGGGCTCA TGTCCTATCA GCTTGTTGGT
GC86    GGTAT CGCGCTCTTG GATGGGCTCA TGTCCTATCA GCTTGTTGGT
SBR2046 GGTAT CGCGCTCTTG GATGGGCTCA TGTCCTATCA GCTTGTTGGT
RC25    AGTGC GGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC19    AGTGC GGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
SBR2016 AGTGC GGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC7     AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC14    AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC99    AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC11    AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC73    AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC90    AGTGC GGCGCTCTTT GATGAGCTCA TATCCTATCA NCTTGTTGGT

[ 351 400 ]
SBR1024 GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA
SBR1015 GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA
GC86    GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA
SBR2046 GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA
RC25    AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC19    AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
SBR2016 AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC7     AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC14    AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC99    AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC11    AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC73    AGGGTAACGG CCTACCAAGG CTTTGACGGG TATCTGGTCT GAGAGGACGA
RC90    AGGGTAACGG CCTACCAAGG CTTTGACGGG TATCTGGTCT GAGAGGACGA

[ 401 450 ]
SBR1024 TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
SBR1015 TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
GC86    TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
SBR2046 TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC25    TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC19    TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
SBR2016 TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC7     TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC14    TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC99    TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC11    TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC73    TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC90    TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
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[ 451 500 ] 

SBR1024 GTAAGGAATA TTGCGCAATG GGC.GACAGC CTGACGCAGC NACGCCGCGT
SBR1015 GTAAGGAATA TTGCGCAATG GGC.GACAGC CTGACGCAGC NACGCCGCGT
GC86    GTAAGGAATA TTGCGCAATG GGC.GACAGC CTGACGCAGC NACGCCGCGT
SBR2046 GTAAGGAATA TTGCGCAATG GGC.GACAGC CTGACGCAGC GACGCCGCGT
RC25    GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC NACGCCGCGT
RC19    GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC GACGCCGCGT
SBR2016 GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC NACGCCGCGT
RC7     GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC NACGCCGCGT
RC14    GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC NACGCCGCGT
RC99    GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC CACGCCGCGT
RC11    GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC CACGCCGCGT
RC73    GTAAGGAATA TTGCGCAATG GGC.GAAACC CNGACGCAGC CACGCCGCGT
RC90    GTAAGGAATA TTGCGCAATG GGC.GAAACC CNGACGCAGC CACGCCGCGT

[ 501 550 ]
SBR1024 GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGCA GGGAAGATGG
SBR1015 GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGCA GGGAAGATGG
GC86    GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGCA GGGAAGATGG
SBR2046 TGGGGATGAA AGTC.TTCCG ATTGTAAACC CCTTTCCGCA GGGAAGATGG
RC25    GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC19    GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
SBR2016 GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC7     GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC14    GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC99    GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC11    GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC73    GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC90    GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG



[ 551 600 ]
SBR1024 AACGG GTAA CCGTTCG GACGGTACCT GCAGAAGCAG
SBR1015 AACGG GTAA CCGTTCG GACGGTACCT GCAGAAGCAG
GC86    AACGG GTAA CCGTTCG GACGGTACCT GCAGAAGCAG
SBR2046 AACGG GTAA CCGTTCG GACGGTACCT GCAGAAGCAG
RC25    AGCGA GCAA TCGTTCG GACGGTACCT CCAGAAGCAG
RC19    AGCCA GCAA TCGTTCG GACGGTACCT CCAGAAGCAG
SBR2016 AGCGA GCAA TCGTTCG GACGGTACCT CCAGAAGCAG
RC7     AGCGA GCAA TCGTTCG GACGGTACCT CCAGAAGCAG
RC14    AGCGA GCAA TCGTTCG GACGGTACCT CCAGAAGCAG
RC99    AGCGA GCAA TCGTTCG GACGGTACCT CCAGAAGCAG
RC11    AGCGA GCAA TCGTTCG GACGGTACCT CCAGAAGCAG
RC73    AACGA GCAA TCGTTCG GACGGTACCT CCAGAAGCAG
RC90    AACGA GCAA TCGTTCG GACGGTACCT CCAGAAGCAG
[ 601 650 ]
SBR1024 CCACGGCTAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
SBR1015 CCACGGCTAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
GC86    CCACGGCTAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
SBR2046 CCACGGCTAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC25    CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC19    CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
SBR2016 CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC7     CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC14    CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC99    CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC11    CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC73    CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC90    CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG

[ 651 700 ]
SBR1024 TTGTTCGGAT TTACTGGGCG TACAGGGAGC GTAGGCGGTT GGGTAAGCCC
SBR1015 TTGTTCGGAT TTACTGGGCG TACAGGGAGC GTAGGCGGTT GGGTAAGCCC
GC86    TTGTTCGGAT TTACTGGGCG TACAGGGAGC GTAGGCGGTT GGGTAAGCCC
SBR2046 TTGTTCGGAT TTACTGGGCG TACAGGGAGC GTAGGCGGTT GGGTAAGCCC
RC25    TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC19    TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTANGCGGTT TGGTAAGCCT
SBR2016 TTGCTTGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC7     TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC14    TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC99    TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC11    TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC73    TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC90    TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT

[ 701 750 ]
SBR1024 TCCGTGAAAT CTCCGGGCCT AACCCGGAAA GTGCGGAGGG GACTGCTCGG
SBR1015 TCCGTGAAAT CTCCGGGCCT AACCCGGAAA GTGCGGAGGG GACTGCTCGG
GC86    TCCGTGAAAT CTCCGGGCCT AACCCGGAAA GTGCGGAGGG GACTGCTCGG
SBR2046 TCCGTGAAAT CTCCGGGCCT AACCCGGAAA GTGCGGAGGG GACTGCTCGG
RC25    TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGACGG TACTGCCAGG
RC19    TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
SBR2016 TCTGTTAAAG CTTCGGGCCC AACCCGAAAA GCGCAGAGGG TACTGCCAGG
RC7     TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGGGGG TACTGCCAGG
RC14    TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
RC99    TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
RC11    TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
RC73    TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
RC90    TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
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[ 751 800 ] 

SBR1024 CTAGAGGATG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
SBR1015 CTAGAGGATG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
GC86    CTAGAGGATG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
SBR2046 CTAGAGGATG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC25    CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC19    CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
SBR2016 CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC7     CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC14    CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC99    CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC11    CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC73    CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC90    CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG



[ 801 850 ]
SBR1024 TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATTTC
SBR1015 TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATTTC
GC86    TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATTTC
SBR2046 TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATTTC
RC25    TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC19    TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATGCC
SBR2016 TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC7     TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC14    TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC99    TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC11    TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC73    TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC90    TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC

[ 851 900 ]
SBR1024 TGACGCTGAG GCTCGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
SBR1015 TGACGCTGAG GCTCGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
GC86    TGACGCTGAG GCTCGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
SBR2046 TGACGCTGAG GCTCGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC25    TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC19    TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
SBR2016 TGACGCTGAG ACACGAAAAC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC7     TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC14    TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC99    TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC11    TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC73    TGACGCTGAG ACACGAAAGC GTGGGGNGCA AACAGGATTA GATACCCTGG
RC90    TGACGCTCAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
[ 901 950 ]
SBR1024 TAGTCCACGC CTTAAACGAT GGATACTAAG TGTCGGCGG.
SBR1015 TAGTCCACGC CTTAAACGAT GGATACTAAG TGTCGGCGG.
GC86    TAGTCCACGC CTTAAACGAT GGATACTAAG TGTCGGCGG.
SBR2046 TAGTCCACGG CTTAAACGAT GGATACTAAG TGTCGGCGG.
RC25    TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG.
RC19    TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG.
SBR2016 TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG.
RC7     TAGTCCACGC CCTAAGCTAT GGATACTAAG TGTCGGCGG.
RC14    TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG.
RC99    TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG.
RC11    TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG.
RC73    TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG.
RC90    TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG.



[ 951 1000 ]
SBR1024 G TTA CCGCCGGTG CCGCAGCTAA
SBR1015 G TTA CCGCCGGTG CCGCAGCTAA
GC86    G TTA CCGCCGGTG CCGCAGCTAA
SBR2046 G TTA CCGCCGGTG CCGCAGCTAA
RC25    G TTA CCGCCGGTG CCGCAGCTAA
RC19    G TTA CCGCCGGTG CCGCAGCTAA
SBR2016 .G TTA.. CCGCCGGTG CCGCAGCTAA
RC7     G TTA CCGCCGGTG CCGCAGCCAA
RC14    G TTA CCGCCGGTG CCGCAGCTAA
RC99    G TTA CCGCCGGTG CCGCAGCTAA
RC11    G TTA CCGCCGGTG CCGCAGCTAA
RC73    G TTA CCGCCGGTG CCGCAGCTAA
RC90    G TTA. CCGCCGGTG CCGCAGCTAA

[ 1001 1050 ]
SBR1024 CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
SBR1015 CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
GC86    CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
SBR2046 CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC25    CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC19    CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
SBR2016 CGCATTAAGT ATCCCGCCTG GGAGGTACGG CCGCAAGGTT GAAACTCAAA
RC7     CGCGTTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC14    CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC99    CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC11    CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC73    CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC90    CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
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[ 1051 1100 ] 

SBR1024 GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
SBR1015 GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
GC86    GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
SBR2046 GGAATTGACG GGGCCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC25    GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC19    GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
SBR2016 GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCTTGTGGT TTAATTCGAC
RC7     GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC14    GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC99    GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC11    GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC73    GGGATTGACG GGGGCCCGCA CAAGCGGTGG GGCATGTGGT TTAATTCGAC
RC90    GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC



[ 1101 1150 ]
SBR1024 GCAACGCGAA GAACCTTA.C CCAGGCTGGA CATG CAGGTAG
SBR1015 GCAACGCGAA GAACCTTA.C CCAGGCTGGA CATG CAGGTAG
GC86    GCAACGCGAA GAACCTTA.C CCAGGCTGGA CATG CAGGTAG
SBR2046 GCAACGCGAA GAACCTTA.C CCAGGCAGGA CATG CAGGTAG
RC25    GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG CACGTAG
RC19    GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG CACGTAG
SBR2016 GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG CACGTAG
RC7     GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG CACGTAG
RC14    GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG CACGTAG
RC99    GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG... CACGTAG
RC11    GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG CACGTAG
RC73    GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG CACGTAG
RC90    GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG CACGTAG

[ 1151 1200 ]
SBR1024 TAGAAGGGT. .GAAA..GCC TAACGAGGTA GCAA TACCAT
SBR1015 TAGAAGGGT. .GAAA..GCC TAACGAGGTA GCAA TACCAT
GC86    TAGAAGGGT. .GAAA..GCC TAACGAGGTA GCAA CACCAT
SBR2046 TAGAAGGGT. .GAAA..GCC TAACGAGGTA GCAA..... TACCAT
RC25    TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA TACCAG
RC19    TAGAAAGGT. .GAAA GNC TAACGAGGTA GCAA TACCAG
SBR2016 TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA TACCAG
RC7     TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA TACCAG
RC14    TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA TACCAG
RC99    TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA TACCAG
RC11    TANAAAGGT. GAAA..GCC TGACGAGGTA GCAA TACCAG
RC73    TNGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA TACCAG
RC90    TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA TACCAG
[ 1201 1250 ]
SBR1024 CCTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
SBR1015 CCTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
GC86    CCTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
SBR2046 CCTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC25    CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC19    CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
SBR2016 CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC7     CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC14    CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC99    CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC11    CGTGCTCAGG TGCTGCATGG CTGTCTTCAG CTCGTGCCGT GAGGTGTTGG
RC73    CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC90    CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG

[ 1251 1300 ]
SBR1024 GTTAAGTCCC GCAACGAGCG CAACCCCTGT CTTCAGTTAC CAACGG..
SBR1015 GTTAAGTCCC GCAACGAGCG CAACCCCTGT CTTCAGTTAC CAACGG..
GC86    GTTAAGTCCC GCAACGAGCG CAACCCCTGT CTTCAGTTAC CAACGG..
SBR2046 GTTAAGTCCC GCAACGAGCG CAACCCCTGT CTTCAGTTAC CAACGG..
RC25    GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG..
RC19    GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG..
SBR2016 GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG..
RC7     GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG..
RC14    GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG..
RC99    GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG..
RC11    GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG..
RC73    GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG..
RC90    GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TGCCGG..

[ 1301 1350 ]
SBR1024 GTCATG.... CCGGGAACTC TGGAGAGACT GCCCAGGAGA ACGGG.GAGG
SBR1015 GTCATG.... CCGGGAACTC TGGAGAGACT GCCCAGGAGA ACGGGGGAGG
GC86    GTCATG.... CCGGGAACTC TGGAGAGACT GCCCAGGAGA ACGGG.GAGG
SBR2046 GTCATG.. CCGGGAACTC TGGAGAGACT GCCCAGGAGA ACGGG.GAGG
RC25    GTCATG.. CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGG.GAGG
RC19    GTCATG.. CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGG.GAGG
SBR2016 GTCATG.. CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGG.GAGG
RC7     GTCATG.. CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGGGGAGG
RC14    GTCATG.. CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGG.GAGG
RC99    GTCATG.. CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGGGAAGG
RC11    GTCATG.. CCGAACACTC TGAAAGGACT GCCCAGGATA ACGGGGAAGG
RC73    GTCATG.. CCGAACACTC TGAAAGGACT GCCCAGGATA ACGGGGAAGG
RC90    GTCATG.. CCGAACACTC TGAAAGGACT GCCCAGGATA ACGGGGAAGG
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[ 1351 1400 ] 

SBR1024 AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
SBR1015 AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
GC86    AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
SBR2046 AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC25    AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC19    AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
SBR2016 AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC7     AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC14    AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC99    AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC11    AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC73    AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATACCT GGGGCCACAC
RC90    AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC

[ 1401 1450 ]
SBR1024 ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT AAGGGGGAGC
SBR1015 ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT AAGGGGGAGC
GC86    ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT AAGGGGGAGC
SBR2046 ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT AAGGGGGAGC
RC25    ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT GAGGGGGAGC
RC19    ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT GAGGGGGAGC
SBR2016 ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT GAGGGGGAGC
RC7     ACGTGCTACA ATGGCCGGTA CAAAACGCTG CAAACCC.GT GAGGGGGAGC
RC14    ACGTGCTACA ATGGCCGGTA TAAAACGCTG CAAACCC.GT GAGGGGGAGC
RC99    ACGTGCTACA ATGGCCGGTA CAAAACGCTG CAAACCC.GT GAGGGGGAGC
RC11    ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT GAGGGGGAGC
RC73    ACGTGCTACA ATGGCCGGTA CAAAACGCTG CAAACCC.GT GAGGGGGAGC
RC90    ACGTGCTACA ATGGCCGGTA CAAAACGCTG CAAACCC.GT GAGGGGGAGC

[ 1451 1500 ]
SBR1024 CAATCCCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
SBR1015 CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
GC86    CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
SBR2046 CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC25    CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC19    CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
SBR2016 CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC7     CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC14    CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC99    CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC11    CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC73    CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC90    CAATCGCAAA AAACCGGCCT CAGTTCANAT TGAGGTCTGC AACTCGACCT
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[ 1501 1550 j 

SBR1024 CATGAAGGCG GAATCGCTAG TAATCCCGGA TCAG.CACGC CGGGGTGAAT
SBR1015 CATGAAGGCG GAATCGCTAG TAATCCCGGA TCAG.CACGC CGGGGTGAAT
GC86    CATGAAGGCG GAATCGCTAG TAATCCCGGA TCAG.CACGC CGGGGTGAAT
SBR2046 CATGAAGGCG GAATCGCTAG TAATCCCGGA TCAG.CACGC CGGGGTGAAT
RC25    CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC19    CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
SBR2016 CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC7     CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC14    CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC99    CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC11    CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC73    CATGAATGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC90    CATGAATGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT

[ 1551 1600 ]
SBR1024 ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGTTTGTTG
SBR1015 ACGTTCCCGG ACCTTGTACA CACCGCCCGT CACACCACGA AAGTTTGTTG
GC86    ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGTTTGTTG
SBR2046 ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGTTTGTTG
RC25    ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC19    ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
SBR2016 ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC7     ACGTTCCCGG GCCTTGTGCA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC14    ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC99    ACGTNCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC11    ACGTNCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC73    ACGTNCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC90    ACGTNCCCGG GCCTTGTACA CGCCGCCCGT CACACCACGA AAGCCTGTTG

[ 1601 1650 ]
SBR1024 TACCTGAAGT CGTTGGCGCC AACC GCAA...... GGAGGCAGAC
SBR1015 TACCTGAAGT CGTTGGCGCC AACC GCAA GGAG
GC86    TACCTGAAGT CGTTGGCGCC AACC GCAA GGGGGCAGAC
SBR2046 TACCTGAAGT CGTTGGCGCC AACC GCAA GGAGGCAGAC
RC25    TACCTGAAGT CGCCCAAGCC AACC GCAA GGAGGCAGGC
RC19    TACCTGAAGT CGCCCAAGCC AACC GCAA GGAGGCAGGC
SBR2016 TACCTGAAGT CGCCCAAGCC AACC GCAA GGAGGCAGGC
RC7     TACCTGAAGT CGCCCAAGCC AACC GCAA GGAGGCAGGC
RC14    TACCTGAAGT CGCCCAAGCC AACC GCAA GGAGGCAGGC
RC99    TACCTGAAGT CGCCCAAGCC AACC GCAA GAAGGCAGGC
RC11    TACCTGAAGT CGCCCAAGCC AACC GCAA GGAGGCAGGC
RC73    TACCTGAAGT CGCCCAAGCC AACC GCAA GGAGGCAGGC
RC90    TACCTGAAGT CGCCCAAGCC AACC GCAA GGAGGCANGC

Fig. 8 (continued) 
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[ 1651 1700 ] 

SBR1024 GCCCACGGTA TGACCGATGA TTGGG
SBR1015
GC86    GCCCACGGTA TGACCGATGA TTGGGGTGAA GTCGTAACAA GGTAACCGTA
SBR2046 GCCCACGGTA TGACCGATGA TTGGGG
RC25    GCCCACGGTA TGGCCCGTGA TTGGGGTGAA GTCGTAACAA GGTAACCGTA
RC19    GCCCACGGTA TGGCCGGTGA TTGGGGTGAA GTCCTAACA-
SBR2016 GCCCACGGTA TGGC
RC7     GCCCACGGTA TGGCCG
RC14    GCCCACGGTA TGGCCGGTGA T
RC99    GCCCACGGTA TGGCCGGTGA
RC11    GCCCACGGTA TGGCCGGTGA TGGGG
RC73    GCCCACGGTA TGGCCGGTGA TGGGG
RC90    GCCCACGGTA TGGCCGGTGA TG...

[ 1701 1750 ]
SBR1024
SBR1015
GC86    ATC
SBR2046
RC25    AA-
RC19    -
SBR2016
RC7
RC14
RC99
RC11
RC73
RC90
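
An alignment block of the kind shown in Fig. 8 can be compared column by column to find positions at which the clones differ, which is one way to look for candidate sites for group-specific primers or probes. The sketch below is illustrative only: the helper names are hypothetical, the parser assumes each row is a clone name followed by space-separated sequence columns, and the two rows are taken from the [ 351 400 ] block.

```python
from typing import Dict, List


def parse_block(text: str) -> Dict[str, str]:
    """Map each clone name to its sequence with the column spacing removed."""
    rows: Dict[str, str] = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip empty rows and rows that carry only a name
        rows[parts[0]] = "".join(parts[1:])
    return rows


def variable_columns(rows: Dict[str, str]) -> List[int]:
    """0-based offsets within the block at which the rows do not all agree."""
    seqs = list(rows.values())
    width = min(len(s) for s in seqs)
    return [i for i in range(width) if len({s[i] for s in seqs}) > 1]


block = """
SBR1024 GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA
RC25    AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
"""

rows = parse_block(block)
print(variable_columns(rows))
```

Offsets are relative to the start of the block, so an offset of 0 in this example corresponds to alignment position 351.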
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